COMPOSITIONS FOR THE DIAGNOSIS, PREVENTION,
AND TREATMENT OF TUMOR PROGRESSION

1. INTRODUCTION

The present invention relates to compositions and
methods for the diagnosis, prevention and treatment of
tumor progression in mammals, for example, humans. The
different types of tumors may include, but are not
limited to, human melanomas, breast, gastrointestinal
tumors such as esophageal, stomach, duodenal, colon,
colorectal and rectal cancers, prostate, bladder,
testicular, ovarian, uterine, cervical, brain, lung,
bronchial, larynx, pharynx, liver, pancreatic, thyroid,
bone, various types of skin cancers and neoplastic
conditions such as leukemias and lymphomas.

Specifically, genes which are differentially expressed in
tumor cells relative to normal cells and/or relative to
tumor cells at a different stage of tumor progression are
identified. For example, genes are identified which are
differentially expressed in benign (e.g., non-malignant)
tumor cells relative to malignant tumor cells exhibiting
a high metastatic potential. Genes are also identified
via the ability of their gene products to interact with
gene products involved in the progression to and/or
aggressiveness of neoplastic tumor disease states. The
genes identified can be used diagnostically or as targets
for therapeutic intervention. In this regard, the
present invention provides methods for the identification
of compounds useful in the diagnosis, prevention and
therapeutic treatment of tumor progression, including,
for example, metastatic neoplastic disorders. The
present invention also provides methods for the
identification of compounds useful in the diagnosis,
prevention and therapeutic treatment of tumor
progression, including, for example, pre-neoplastic
and/or benign states. Additionally, methods are provided
for the diagnostic evaluation and prognosis of conditions
involving tumor progression, for the identification of
subjects exhibiting a predisposition to such conditions,
for monitoring patients undergoing clinical evaluation
for the prevention and treatment of tumor progression
disorders, and for monitoring the efficacy of compounds
used in clinical trials.

2. BACKGROUND OF THE INVENTION

Cancer is the second leading cause of death in the
United States, after heart disease (Boring, C.C. et al.,
1993, CA Cancer J. Clin. 43:7). It develops in one in
three Americans, and one of every four Americans dies of
cancer. Cancer is characterized primarily by an increase
in the number of abnormal, or neoplastic, cells derived
from a given normal tissue which proliferate to form a
tumor mass, the invasion of adjacent tissues by these
neoplastic tumor cells, and the generation of malignant
cells which spread via the blood or lymphatic system to
regional lymph nodes and to distant sites. The latter
progression to malignancy is referred to as metastasis.

Cancer can be viewed as a breakdown in the
communication between tumor cells and their environment,
including their normal neighboring cells. Signals, both
growth-stimulatory and growth-inhibitory, are routinely
exchanged between cells within a tissue. Normally, cells
do not divide in the absence of stimulatory signals, and,
likewise, will cease dividing in the presence of
inhibitory signals. In a cancerous, or neoplastic,
state, a cell acquires the ability to "override" these
signals and to proliferate under conditions in which
normal cells would not grow.

Tumor cells must acquire a number of distinct
aberrant traits to proliferate. Reflecting this
requirement is the fact that the genomes of certain
well-studied tumors carry several different independently
altered genes, including activated oncogenes and
inactivated tumor suppressor genes. Each of these
genetic changes appears to be responsible for imparting
some of the traits that, in aggregate, represent the full
neoplastic phenotype (Land, H. et al., 1983, Science
222:771; Ruley, H.E., 1983, Nature 304:602; Hunter, T.,
1991, Cell 64:249).

In addition to unhindered cell proliferation,
cells must acquire several traits for tumor progression
to occur. For example, early on in tumor progression,
cells must evade the host immune system. Further, as
tumor mass increases, the tumor must acquire vasculature
to supply nourishment and remove metabolic waste.
Additionally, cells must acquire an ability to invade
adjacent tissue, and, ultimately, cells often acquire the
capacity to metastasize to distant sites.

The biochemical basis for immune recognition of
tumor cells is unclear. It is possible that the
tumorigenicity of cells can increase when the cells'
display of Class I histocompatibility antigens is reduced
(Schrier, P.I. et al., 1983, Nature 305:771), in that
these antigens, in conjunction with tumor-specific
antigens, are required for the tumor cells to be
recognized by cytotoxic T lymphocytes (CTLs). Tumor
cells which have lost one or more genes encoding
tumor-specific antigens seem to escape recognition by the
corresponding reactive CTLs (Van der Bruggen, P. et al.,
1991, Science 254:1643).

Once a tumor reaches more than about 1 mm in
diameter, it can no longer rely on passive diffusion for
nutrition and removal of metabolic waste. At this point
the tumor mass must make intimate contact with the
circulatory system. Thus, cells within more advanced
tumors secrete angiogenic factors which promote
neovascularization, i.e., the growth of blood vessels
from surrounding tissue into the tumor mass (Folkman, J.
and Klagsbrun, M., 1987, Science 235:442; Liotta, L.A. et
al., 1991, Cell 64:327). Among these angiogenic factors
are the fibroblast growth factor (FGF) and endothelial
cell growth factor (ECGF).

Neovascularization can, in fact, be an essential
precursor to metastasis. First, the process is required
for a large increase in tumor cell number, which in turn
allows the appearance of rare metastatic variants.
Further, neovascularization provides a direct portal
of entry into the circulatory system which can be used by
metastasizing cells.

A variety of biochemical factors have been
associated with different phases of metastases. Cell
surface receptors for collagen, glycoproteins such as
laminin, or proteoglycans, facilitate tumor cell
attachment, an important step in invasion and metastases.
Attachment then triggers the release of degradative
enzymes which facilitate the penetration of tumor cells
through tissue barriers. Once the tumor cell has entered
the target tissue, specific growth factors are required
for further proliferation.

It is apparent that the complex process of tumor
progression must involve multiple gene products. It is
therefore important to define the role of specific genes
involved in tumor progression, to identify those gene
products involved in the tumor progression process and to
further identify those gene products which can serve as
therapeutic targets for the diagnosis, prevention and
treatment of metastases of various forms of cancers.

Some attempts have been made to study genes which
are thought to elicit or augment tumor progression
phenotypes. Mutations may drive a wave of cellular
multiplication associated with gradual increases in tumor
size, disorganization and malignancy. For example, a
mutation in a tumor suppressor gene, which is a negative
regulator of cellular proliferation, results in a loss of
crucial control over tumor growth and progression.
Differential expression of the following suppressor genes
has been demonstrated in human cancers: the
retinoblastoma gene, RB; the Wilms' tumor gene, WT1
(11p); the gene deleted in colon carcinoma, DCC (18q);
the neurofibromatosis type 1 gene, NF1 (17q); and the
gene involved in familial adenomatous polyposis coli, APC
(5q) (Vogelstein, B. and Kinzler, K.W., 1993, Trends
Genet. 9:138-141).

Insight into the complex events that lead from
normal cellular growth to neoplasia, invasion and
metastasis is crucial for the development of effective
diagnostic and therapeutic strategies. The foregoing
studies are aimed at defining the role of particular gene
products presumed to be involved in tumor progression.
However, such approaches cannot identify the full panoply
of gene products that are involved in the cascade of
steps in tumor progression. A great need, therefore,
exists for the successful identification of those genes
which are differentially expressed in cells involved in
or predisposed to a tumor progression phenotype. Such
differentially expressed genes and/or gene products can
represent useful diagnostic markers and/or therapeutic
targets for tumor progression disorders. With respect to
diagnostic techniques, such genes and/or gene products
could represent useful markers for the diagnosis,
especially early diagnosis, given the correlation between
early diagnosis and successful cancer treatment. With
respect to therapeutic treatments, such differentially
expressed genes and/or gene products could represent
useful targets for therapeutic treatment of various forms
of tumor progression disorders, including metastatic and
non-metastatic neoplastic disorders, and for inhibiting
the progression of pre-neoplastic lesions (e.g.,
hyperplastic lesions or other benign tumors) to malignant
tumors.

Differentially expressed genes involved in tumor
metastasis have been identified using murine melanoma
cell lines of varying metastatic potentials,
N-nitroso-methylurea-induced rat mammary carcinomas, mammary
carcinoma cell lines, human breast tumors and spontaneous
colonic and intestinal tumors in mice (Steeg, P.S., et
al., 1988, J. Natl. Cancer Inst. 80:200-204; Qian, F., et
al., 1994, Cell 77:335-347; Leone, A., et al., 1991,
Cell 65:25-35; Zou, Z., et al., 1994, Science 263:526-529; and
Fodde, R., et al., 1994, Proc. Natl. Acad. Sci. USA
91:8969-8973).

3. SUMMARY OF THE INVENTION

The present invention relates to compositions for
diagnosis, prevention, and treatment of tumor
progression. Specifically, murine and human genes are
identified and described which are differentially
expressed in tumor cells relative to normal cells and/or
to tumor cells at a different stage of tumor progression.
For example, genes are identified which are
differentially expressed in benign (e.g., non-malignant)
tumor cells relative to malignant, metastatic tumor
cells. The modulation of the expression of the
identified genes and/or the activity of the identified
gene products can be utilized therapeutically to treat
disorders involving tumor progression, including, for
example, metastatic disorders. As such, methods and
compositions are described for the identification of
novel therapeutic compounds for the inhibition of tumor
progression and the treatment of tumor progression
disorders, including metastatic diseases.

Further, the identified genes and/or gene products
can be used to identify cells exhibiting or predisposed
to a disorder involving a tumor progression phenotype,
thereby diagnosing individuals having, or at high risk
for developing, such disorders. Additionally, the
identified genes and/or gene products can be used to
grade or stage identified tumor cells. Still further,
the detection of the differential expression of
identified genes can be used to devise treatments (for
example, chemoprevention) before the benign cells attain
a malignant state. Still further, the detection of
differential expression of identified genes can be used
to design a preventive intervention in pre-neoplastic
cells in individuals at high risk.

The invention features a composition for detecting
expression of a gene transcript or gene product of gene
030 for use in a method of diagnosing tumor progression
in a mammal, wherein a level of expression of the gene
transcript or gene product lower in a test sample than in
a control sample indicates a tumor progression state in
the test sample.
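
The comparison recited above can be illustrated with a minimal,
hypothetical Python sketch. It is not the assay of the invention: the
function name, the numeric expression levels and the
`min_fold_reduction` parameter are assumptions introduced only for
illustration. The text itself specifies only that a lower level of
030 expression in the test sample than in the control sample
indicates a tumor progression state.

```python
def indicates_tumor_progression(test_level, control_level, min_fold_reduction=1.0):
    """Return True when expression in the test sample is lower than in the control.

    test_level, control_level: hypothetical, already-normalized expression
    measurements of the 030 transcript or gene product (arbitrary units).
    min_fold_reduction: assumed stringency knob; 1.0 means "any reduction",
    2.0 means the test level must be less than half the control level.
    """
    if test_level < 0 or control_level <= 0:
        raise ValueError("test_level must be >= 0 and control_level must be > 0")
    if min_fold_reduction < 1.0:
        raise ValueError("min_fold_reduction must be >= 1.0")
    # Lower expression in the test sample indicates a tumor progression state.
    return test_level * min_fold_reduction < control_level
```

For example, under these assumptions a test level of 10 against a
control level of 100 would be flagged, whereas equal levels would not.
In practice the threshold would have to be established empirically for
the detection method used.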

The invention also features a compound for use in
a method of treating tumor progression in a mammal,
wherein said compound increases the level of expression
or activity of a gene transcript or gene product of gene
030 in cells exhibiting a tumor progression state. The
compound can also be used for the manufacture of a
medicament for use in this method.

"Tumor progression," as used herein, refers to any
event which, first, promotes the transition of a normal,
non-neoplastic cell to a cancerous, neoplastic one. Such
events include ones which occur prior to the onset of
neoplasia, and which predispose, or act as a step toward,
the cell becoming neoplastic. These events can, for
example, include ones which cause a normal cell to
exhibit a pre-neoplastic phenotype. Second, such events
also include ones which bring about the transition from a
pre-neoplastic state to a neoplastic one. Such events
can, for example, include ones which promote two
hallmarks of the neoplastic state, namely unhindered cell
proliferation and/or tumor cell invasion of adjacent
tissue. Third, tumor progression can include events
which promote the transition of a tumor cell to a
metastatic state. Within each state (e.g.,
pre-neoplastic, neoplastic and metastatic), the term "tumor
progression" as used herein can also refer to the
disorder severity or aggressiveness a cell exhibits
relative to other cells within the same state.

Because multiple tumor progression events occur as
a cell progresses from normal to neoplastic and
metastatic states, certain cells will have undergone a
different set of such tumor progression events. As such,
such cells are referred to herein as belonging to
different "tumor progression stages."

A "disorder involving tumor progression" or a
"tumor progression disorder," as used herein, refers to
the state of a cell or cells which have undergone or are
in the process of undergoing a tumor progression event,
as defined above.

"Differential expression," as used herein, refers
to both quantitative, as well as qualitative, differences
in the genes' temporal and/or cellular expression
patterns among, for example, normal and neoplastic tumor
cells, and/or among tumor cells which have undergone
different tumor progression events. Differentially
expressed genes can represent "fingerprint genes," and/or
"target genes."

"Fingerprint gene," as used herein, refers to a 
differentially expressed gene whose expression pattern
can be utilized as part of a prognostic or diagnostic
marker for the evaluation of a disorder involving tumor
progression, or which, alternatively, can be used in
methods for identifying compounds useful for the
treatment of such disorders. For example, the effect of
the compound on the fingerprint gene expression normally
displayed in connection with disorders involving tumor
progression can be used to evaluate the efficacy of the
compound as a treatment for such a disorder, or can,
additionally, be used to monitor patients undergoing
clinical evaluation for the treatment of the disorder.

"Fingerprint pattern," as used herein, refers to 
the pattern generated when the expression pattern of a 
series (which can range from two up to all the 
fingerprint genes which exist for a given state) of 
fingerprint genes is determined. A fingerprint pattern
can be used in the same diagnostic, prognostic and 
compound identification methods as the expression of a 
single fingerprint gene. 
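
As a rough illustration of how a fingerprint pattern might be
represented, the following Python sketch classifies each gene in a
series as up-regulated, down-regulated or unchanged relative to a
control. This is only a sketch under stated assumptions: the function
name, the twofold threshold, the gene labels other than 030 and the
numeric levels are all hypothetical and are not taken from the
specification.

```python
def fingerprint_pattern(test_levels, control_levels, fold_threshold=2.0):
    """Build a pattern mapping each fingerprint gene to 'up', 'down' or 'unchanged'.

    test_levels, control_levels: dicts of hypothetical normalized expression
    levels keyed by gene name. fold_threshold is an assumed cutoff.
    """
    if len(test_levels) < 2:
        # A fingerprint pattern is defined over a series of two or more genes.
        raise ValueError("a fingerprint pattern needs at least two genes")
    pattern = {}
    for gene in sorted(test_levels):
        test = test_levels[gene]
        control = control_levels[gene]  # KeyError if the control lacks the gene
        if test == control:
            pattern[gene] = "unchanged"
        elif test * fold_threshold <= control:
            pattern[gene] = "down"
        elif test >= control * fold_threshold:
            pattern[gene] = "up"
        else:
            pattern[gene] = "unchanged"
    return pattern


# Hypothetical example: 030 is reduced, geneA is elevated, geneB is unchanged.
example = fingerprint_pattern(
    {"030": 5.0, "geneA": 40.0, "geneB": 10.0},
    {"030": 50.0, "geneA": 10.0, "geneB": 11.0},
)
```

A pattern of this kind could then be compared between samples, or
before and after exposure to a candidate compound, in the same way as
the expression of a single fingerprint gene.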

"Target gene," as used herein, refers to a 
differentially expressed gene involved in tumor
progression such that modulation of the level of target
gene expression or of target gene product activity can
act to prevent and/or ameliorate symptoms of the tumor
progression. Compounds that modulate the expression of
the target gene or the activity of the target gene
product can be used in the treatment of neoplastic
diseases, including, for example, disorders involving the
progression to a metastatic state. Still further,
compounds that modulate the expression of the target gene
or activity of the target gene product can be used in
treatments to prevent benign cells from attaining a
malignant state. Still further, compounds that modulate
the expression of the target gene or activity of the
target gene product can be used to design a preventive
intervention in pre-neoplastic cells in individuals at
high risk.

Further, "pathway genes" are defined via the 
ability of their products to interact with other gene 
products involved in tumor progression disorders.
Pathway genes can also exhibit target gene and/or
fingerprint gene characteristics. 

The present invention includes the products of
such fingerprint, target, and pathway genes, as well as
antibodies to such gene products. Furthermore, the
engineering and use of cell-based and/or animal-based
models of tumor progression disorders, including
disorders involving metastasis, to which such gene
products can contribute, are described.

The present invention also relates to methods for
prognostic and diagnostic evaluation of tumor progression
conditions, and for the identification of subjects
containing cells predisposed to such conditions.
Furthermore, the invention provides methods for
evaluating the efficacy of therapies for disorders
involving tumor progression, and for monitoring the
progress of patients participating in clinical trials for
the treatment of such diseases.

The tumor progression disorders described herein
can include disorders involved in the progression of such
human cancers as, for example, human melanomas, breast,
gastrointestinal, such as esophageal, stomach, colon,
bowel, colorectal and rectal cancers, prostate, bladder,
testicular, ovarian, uterine, cervical, brain, lung,
bronchial, larynx, pharynx, liver, pancreatic, thyroid,
bone, leukemias, lymphomas, and various types of skin
cancers.

The invention also provides methods for the
identification of compounds that modulate the expression
of genes or the activity of gene products involved in
tumor progression, including the progression of
metastatic neoplastic diseases, as well as methods for
the treatment of such diseases. Such methods can, for
example, involve the administration of such compounds to
individuals exhibiting symptoms or markers of tumor
progression, such as markers for metastatic neoplastic
diseases.

This invention is based, in part, on systematic
search strategies involving in vivo and in vitro
paradigms of tumor progression, including the progression
to metastatic disease, coupled with sensitive and high
throughput gene expression assays, to identify genes
differentially expressed in tumor cells relative to
normal cells and/or relative to tumor cells at a
different tumor progression stage. In contrast to
approaches that merely evaluate the expression of a given
gene product presumed to play a role in one or another of
the various stages of tumor progression, such as, for
example, the progression to a metastatic disease process,
the search strategies and assays used herein permit the
identification of all genes, whether known or novel,
which are differentially expressed in tumor cells
relative to normal cells or relative to tumor cells at a
different stage of tumor progression.

This comprehensive approach and evaluation permits
the discovery of novel genes and gene products, as well
as the identification of an array of genes and gene
products (whether novel or known) involved in novel
pathways that play a major role in the disease pathology.

Thus, the present invention makes possible the
identification and characterization of targets useful for
prognosis, diagnosis, monitoring, rational drug design,
and/or other therapeutic intervention of tumor
progression disorders, including disorders involving
metastasis.

The Example presented in Section 6, below,
demonstrates the successful use of tumor progression
search strategies of the invention to identify genes
which are differentially expressed within tumor cells
relative to tumor cells at a different stage of tumor
progression. Specifically, the Example identifies a gene
which is differentially expressed in metastatic cell
populations relative to benign, non-malignant tumor
cells.

This gene, referred to herein as the 030 gene
(fomy030 in the mouse and fohy030 in humans), is a novel
gene which is expressed at a many-fold higher level in
non-metastatic tumor cells relative to its expression in
metastatic tumor cells. The gene appears in mice and has
the cDNA sequence shown in FIGS. 3A and 3B (SEQ ID NO:2).
A homologous gene, referred to herein as the fohy030
gene, appears in humans and has the cDNA sequence shown
in FIG. 5 (SEQ ID NO:6). An alternative splice form of
the human cDNA has the sequence shown in FIG. 6 (SEQ ID
NO:8). Unless stated expressly otherwise, any general
reference to the 030 gene hereinafter refers to both the
murine (fomy030) and human (fohy030) homologs of this
gene.

The identification of the 030 gene and the
characterization of its expression in particular stages
of metastatic spread provide, therefore, newly
identified targets for the diagnosis, prevention, and
treatment of tumor progression disorders, including
metastatic neoplastic diseases.

Its expression pattern indicates that the 030 gene
product acts to inhibit tumor progression. For example,
a reduction in the level of 030 gene expression
correlates with an increase in a cell's metastatic
potential, i.e., a reduction of 030 gene product in tumor
cells can induce or predispose a cell to progress to a
metastatic state.

Hence, any method which can bring about an
increase in the amount of 030 gene product can inhibit or
slow the progression to metastasis. In fact, it is
possible that the 030 gene product exhibits general tumor
inhibition properties.

A cDNA clone of the murine homolog, designated
fomy030, is described herein in FIGS. 3A and 3B (SEQ ID
NO:2) (nucleotide sequence and amino acid sequence), and
was derived from fomy030 mRNA. However, as used herein,
fomy030 cDNA refers to any DNA sequence that encodes the
amino acid sequence depicted in FIGS. 3A and 3B (SEQ ID
NO:3).

A cDNA clone of the human homolog, designated
fohy030, is shown in FIG. 5 (SEQ ID NO:6) (nucleotide
sequence and amino acid sequence). An alternative splice
form of fohy030 is shown in FIG. 6 (SEQ ID NO:8). Both
were obtained using the entire mouse fomy030 cDNA as a
probe. However, as used herein, fohy030 cDNA refers to
any DNA sequence that encodes the amino acid sequences
depicted in FIG. 5 (SEQ ID NO:7) and FIG. 6 (SEQ ID
NO:9).



3.1. DEFINITIONS

"Tumor progression," as used herein, refers to any
event which, first, promotes the transition of a normal,
non-neoplastic cell to a cancerous, neoplastic one. Such
events include ones which occur prior to the onset of
neoplasia, and which predispose, or act as a step toward,
the cell becoming neoplastic. These events can, for
example, include ones which cause a normal cell to
exhibit a pre-neoplastic phenotype. Second, such events
also include ones which bring about the transition from a
pre-neoplastic state to a neoplastic one. Such events
can, for example, include ones which promote unhindered
cell proliferation and/or tumor cell invasion of adjacent
tissue, which are viewed as hallmarks of the neoplastic
state. Third, tumor progression can include events which
promote the transition of a tumor cell to a metastatic
state. Within each state (e.g., pre-neoplastic,
neoplastic and metastatic), the term "tumor progression"
as used herein can also refer to the disorder severity or
aggressiveness a cell exhibits.

Because multiple tumor progression events occur as
a cell progresses from a normal to neoplastic and
metastatic states, certain cells will have undergone a
different set of such tumor progression events. As such,
such cells are referred to herein as belonging to
different "tumor progression stages."

A "disorder involving tumor progression" or a
"tumor progression disorder," as used herein, refers to
the state of a cell or cells which have undergone or are
in the process of undergoing a tumor progression event,
as defined above.

"Differential expression," as used herein, refers 
to both quantitative, as well as qualitative differences 
in the genes' temporal and/or cellular expression 
patterns among, for example, normal and neoplastic tumor
cells, and/or among tumor cells which have undergone
different tumor progression events. Differentially
expressed genes can represent "fingerprint genes," and/or
"target genes."

"Fingerprint gene," as used herein, refers to a
differentially expressed gene whose expression pattern
can be utilized as part of a prognostic or diagnostic
marker for the evaluation of tumor progression, or which,
alternatively, can be used in methods for identifying
compounds useful for the treatment of tumor progression.
For example, the effect of the compound on the
fingerprint gene expression normally displayed in
connection with tumor progression can be used to evaluate
the efficacy of the compound as a treatment for tumor
progression, or can, additionally, be used to monitor
patients undergoing clinical evaluation for the treatment
of tumor progression.

"Fingerprint pattern," as used herein, refers to 
the pattern generated when the expression pattern of a 
series (which can range from two up to all the
fingerprint genes which exist for a given state) of
fingerprint genes is determined. A fingerprint pattern
can be used in the same diagnostic, prognostic and 
compound identification methods as the expression of a 
single fingerprint gene. 

"Target gene," as used herein, refers to a
differentially expressed gene involved in tumor
progression such that modulation of the level of target
gene expression or of target gene product activity can
act to prevent and/or ameliorate symptoms of the tumor
progression. Compounds that modulate target gene
expression or activity of the target gene product can be
used in the treatment of tumor progression and tumor
progression disorders, including, for example, disorders
involving the progression to a metastatic state.

Further, "pathway genes" are defined via the
ability of their products to interact with other gene
products involved in tumor progression. Pathway genes
can also exhibit target gene and/or fingerprint gene
characteristics.

4. DESCRIPTION OF THE FIGURES

FIG. 1 is a Northern blot confirming differential
regulation of the 030 gene. Total RNA (12 µg/lane)
obtained from F1 (lanes 1 and 3) and F10 (lanes 2 and 4)
melanoma cell cultures was hybridized with a cDNA probe
prepared by random priming of reamplified romy030 band.
(See materials and methods below in Section 6.1.) The
romy030 probe identifies an RNA band of approximately 3
kb, corresponding to a fomy030 mRNA.

FIG. 2 is a nucleotide sequence of romy030 band
(SEQ ID NO:1).

FIGS. 3A and 3B are representations of the
nucleotide and derived amino acid sequences of cDNA clone
fomy030 (SEQ ID NOs:2 [nucleotide sequence] and 3 [amino
acid sequence]) derived from fomy030 mRNA.

FIG. 4 is a Northern blot analysis confirming
differential regulation of the fomy030 gene. Lane 1 is
B16 F1, lane 2 is B16 F10, and lanes 3-6 are B16 H5, B16
H6, B16 H7 and B16 H8.

FIG. 5 is a representation of the nucleotide and
deduced amino acid sequences of cDNA clone of fohy030
(SEQ ID NOs:6 [nucleotide sequence] and 7 [amino acid
sequence]).

FIG. 6 is a comparison of the nucleotide and
deduced amino acid sequences of another cDNA clone of
fohy030 (SEQ ID NOs:8 [nucleotide sequence] and 9 [amino
acid sequence]).

In FIGS. 3A and 3B, the nucleotide sequence is
numbered starting at the first nucleotide, whereas in
FIGS. 5 and 6, the nucleotide sequence is numbered
starting at the ATG start codon.

5. DETAILED DESCRIPTION OF THE INVENTION

Methods and compositions are described herein for the
prevention, treatment and diagnosis of tumor progression,
including tumor progression involving metastatic
disorders, in cells involved in human tumors. Such human
tumors may include, for example, human melanomas, breast,
gastrointestinal tumors such as esophageal, stomach,
duodenal, colon, colorectal and rectal cancers, prostate,
bladder, testicular, ovarian, uterine, cervical, brain,
lung, bronchial, larynx, pharynx, liver, pancreatic,
thyroid, bone, various types of skin cancers and other
neoplastic conditions such as leukemias and lymphomas. The
invention is based, in part, on the evaluation of the
expression and role of all genes that are differentially
expressed in tumor cells relative to normal cells and/or
relative to tumor cells at a different stage of tumor
progression. This permits the definition of disease
pathways and identification of targets in such pathways
that are useful for diagnosis, prevention and treatment
of tumor progression, including the tumor progression
disorders involving metastatic neoplastic diseases.

Genes, termed "target genes" and/or "fingerprint
genes," are described which are differentially expressed
in tumor cells relative to their expression in normal
cells or relative to their expression in tumor cells
which are at a different stage of tumor progression.
Additionally, genes, termed "pathway genes," are described
whose gene products exhibit an ability to interact with
gene products involved in tumor progression, including tumor
progression disorders involving metastatic neoplastic
disorders. Pathway genes can additionally have
fingerprint and/or target gene characteristics. Methods
for the identification of such fingerprint, target, and
pathway genes are also described.

Further, the gene products of such fingerprint,
target, and pathway genes are described in Section 5.2.2,
antibodies to such gene products are described in Section
5.2.3, as are cell- and animal-based models of tumor
progression disorders to which such gene products can
contribute, in Section 5.2.4.

Methods for the identification of compounds which
modulate the expression of genes or the activity of gene
products involved in tumor progression are described in
Section 5.3. Methods for monitoring the efficacy of
compounds during clinical trials are described in Section
5.3.5. Additionally described, below, are methods for
treatment of tumor progression disorders, including
metastatic diseases.

Also discussed, below, are methods for prognostic
and diagnostic evaluation of tumor progression and
disorders involving tumor progression, including
metastatic disorders, and, further, for the
identification of subjects exhibiting a predisposition to
such disorders.

5.1. IDENTIFICATION OF DIFFERENTIALLY 
EXPRESSED GENES

Described herein are methods for the
identification of differentially expressed genes which
are involved in tumor progression. There exist a number
of levels or stages at which the differential expression
of such genes can be exhibited. For example,
differential expression can occur in tumor cells relative
to normal cells, or in tumor cells within different
stages of tumor progression. For example, genes can be
identified which are differentially expressed in
pre-neoplastic versus neoplastic cells. Such genes can
include, for example, ones which promote unhindered cell
proliferation or tumor cell invasion of adjacent tissue,
both of which are viewed as hallmarks of the neoplastic
state. Further, differential expression can occur in
benign (e.g., non-malignant) tumor cells versus
metastatic, malignant tumor cells. Still further,
differential expression can occur among cells within any
one of these states (e.g., pre-neoplastic, neoplastic and
metastatic), and can indicate, for example, a difference
in tumor progression severity or aggressiveness of one
cell relative to that of another cell within the same
state.

Methods for the identification of such
differentially expressed genes are described, below, in
Section 5.1.1. Methods for the further characterization
of such differentially expressed genes, and for their
categorization as target and/or fingerprint genes, are
presented, below, in Section 5.3.

"Differential expression," as used herein, refers to
both quantitative as well as qualitative differences in
the genes' temporal and/or tissue expression patterns.
Thus, a differentially expressed gene can qualitatively
have its expression activated or completely inactivated
in, for example, normal versus tumor progression states,
in cells within different tumor progression states or
among cells within a single given tumor progression
state. Such a qualitatively regulated gene will exhibit
an expression pattern within a given state which is
detectable by standard techniques in one such state, but
is not detectable in both states being compared.
"Detectable," as used herein, refers to an RNA expression
level which is detectable via the standard techniques of
differential display, RT (reverse transcriptase)-coupled
PCR, Northern and/or RNase protection analyses.

Alternatively, a differentially expressed gene can
exhibit an expression level which differs, i.e., is
quantitatively increased or decreased in normal versus
tumor progression states, in cells within different tumor
progression states or among cells within a single given
tumor progression state.
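
The distinction between qualitative and quantitative differential
expression drawn above can be illustrated with a minimal sketch. The
detection limit, the fold-change cutoff and the names `Expression` and
`classify` are hypothetical assumptions chosen for illustration; the
text itself requires only that a difference be large enough to be
visualized by standard techniques.

```python
from dataclasses import dataclass

# Illustrative assumptions, not values taken from the text.
DETECTION_LIMIT = 1.0   # signal below this is treated as "not detectable"
MIN_FOLD_CHANGE = 2.0   # smallest ratio treated as a visible difference


@dataclass
class Expression:
    gene: str
    state_a: float  # signal in one state (arbitrary units)
    state_b: float  # signal in the state it is compared with


def classify(e: Expression) -> str:
    """Label a gene by the kind of expression difference it shows."""
    a_on = e.state_a >= DETECTION_LIMIT
    b_on = e.state_b >= DETECTION_LIMIT
    if a_on != b_on:
        return "qualitative"    # detectable in exactly one of the two states
    if not a_on:
        return "not detected"   # undetectable in both states
    # Both signals are at or above the detection limit, so the ratio is defined.
    ratio = max(e.state_a, e.state_b) / min(e.state_a, e.state_b)
    return "quantitative" if ratio >= MIN_FOLD_CHANGE else "unchanged"
```

In practice the cutoffs would depend on the assay used (for example,
Northern analysis versus quantitative RT-coupled PCR).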

The degree to which expression differs need only
be large enough to be visualized via standard
characterization techniques, such as, for example, the
differential display technique described below. Other
standard, well-known characterization techniques by which
expression differences can be visualized include, but are
not limited to, quantitative RT (reverse transcriptase)-coupled
PCR and Northern analyses and RNase protection
techniques.

Differentially expressed genes can be further
described as target genes and/or fingerprint genes.
"Fingerprint gene," as used herein, refers to a
differentially expressed gene whose expression pattern
can be utilized as part of a prognostic or diagnostic
marker in tumor progression evaluation, or which,
alternatively, may be used in methods for identifying
compounds useful for the prevention or treatment of tumor
progression and tumor progression disorders, including
metastatic disorders. A fingerprint gene can also have
the characteristics of a target gene or a pathway gene
(see below, in Section 5.2).

"Fingerprint pattern," as used herein, refers to
the pattern generated when the expression pattern of a
series (which can range from two up to all the
fingerprint genes which exist for a given state) of
fingerprint genes is determined. A fingerprint pattern
can be used in the same diagnostic, prognostic and
compound identification methods as the expression of a
single fingerprint gene.

"Target gene," as used herein, refers to a
differentially expressed gene involved in tumor
progression in a manner by which modulation of the level
of target gene expression or of target gene product
activity can act to prevent and/or ameliorate symptoms of
disorders involving tumor progression. Tumor progression
disorders include, for example, disorders involved in
human tumors, including, but not limited to, human
melanomas, breast, gastrointestinal, such as esophageal,
stomach, colon, bowel, colorectal and rectal cancers,
prostate, bladder, testicular, ovarian, uterine,
cervical, brain, lung, bronchial, larynx, pharynx, liver,
pancreatic, thyroid, bone, leukemias, lymphomas and
various types of skin cancers. A target gene can also
have the characteristics of a fingerprint gene and/or a
pathway gene (as described, below, in Section 5.2).

5.1.1. METHODS FOR THE IDENTIFICATION OF 
DIFFERENTIALLY EXPRESSED GENES

A variety of methods can be utilized for the
identification of genes which are involved in tumor
progression. Described in Section 5.1.1.1 are
experimental paradigms which can be utilized for the
generation of samples which can be used for the
identification of such genes. Material generated in
paradigm categories can be characterized for the presence
of differentially expressed gene sequences as discussed,
below, in Section 5.1.1.2.

5.1.1.1. PARADIGMS FOR THE IDENTIFICATION
OF DIFFERENTIALLY EXPRESSED GENES

Paradigms which represent models of tumor
progression states are described herein. These paradigms
can be utilized for the identification of genes which are
differentially expressed in normal cells versus cells in
tumor progression states, in cells within different tumor
progression states or among cells within a single given
tumor progression state.

The paradigms described herein include at least
two groups of cells of a given cell type, preferably
genetically matched cells (e.g., cells derived from
variants of the same cell line, or cells derived from a
single individual or biological sample), whose expression
patterns are compared and analyzed for differential
expression. Methods for the analysis of paradigm
material are described, below, in Section 5.1.1.2.

Once a particular gene has been identified through
the use of one paradigm, its expression pattern can be
further characterized, for example, by studying its
expression in a different paradigm. A gene can, for
example, be regulated one way, i.e., can exhibit one
differential gene expression pattern, in a given
paradigm, but can be regulated differently in another
paradigm. The use, therefore, of multiple paradigms can
be helpful in distinguishing the roles and relative
importance of particular genes in tumor progression.

In one embodiment of such a paradigm, referred to
herein as the "in vitro" paradigm, cell lines can be used
to identify genes which are differentially expressed in
tumor progression states. Differentially expressed genes
are detected, as described herein, by comparing the
pattern of gene expression between the experimental and
control conditions. In such a paradigm, genetically
matched tumor cell lines (e.g., variants of the same cell
line) are generally utilized. For example, the gene
expression pattern of two variant cell lines can be
compared, wherein one variant exhibits characteristics of
one tumor progression state while the other variant
exhibits characteristics of another tumor progression
state. Alternatively, two variant cell lines can be
compared, both of which exhibit characteristics of the
same tumor progression state, but which exhibit differing
degrees of tumor progression disorder severity or
aggressiveness. Further, genetically matched cell lines
can be utilized, one of which exhibits characteristics of
a tumor progression state, while the other exhibits a
normal cellular phenotype.

The variant cell lines utilized herein can exhibit
such tumor progression characteristics as, for example, a
high or low metastatic potential, which refers to the
likelihood that a cell will give rise to a distant-site
tumor mass. Alternatively, one or more such variant cell
lines can exhibit pre-neoplastic characteristics or can
exhibit characteristics generally associated with one or
more neoplastic cell phenotypes, such as, for example,
cell proliferation or invasion phenotypes.

In accordance with this aspect of the invention,
the cell line variants are cultured under appropriate
conditions, the cells are harvested, and RNA is isolated
and analyzed for differentially expressed genes, as
described in detail in Section 5.1.1.2, below.

Examples of cell lines that can be used as part of
such in vitro paradigms include but are not limited to
variants of melanoma cell lines, such as, for example,
the murine melanoma B16 F1 cell line which exhibits a low
metastatic potential and the melanoma B16 F10 cell line
which exhibits a high metastatic potential (Fidler, I.
J., 1973, Nature New Biol. 242:148-149); human colon cell
lines, such as, for example, KM12c (tumor cell line with
low metastatic potential) and the KM20L4 (tumor cell line
with high metastatic potential; Morikawa, K., et al.,
1988, Cancer Research 48:1943-1948); prostatic tumor cell
lines, such as, for example, DU 145 (non-metastatic tumor
cell line) and PC-3-M (high metastatic potential tumor
cell line; Karmali, R.A. et al., 1987, Anticancer Res.
7:1173-1180, and Kozlowski, J.M. et al., 1984, Cancer
Research 44:3522-3529); and breast carcinoma tumor cell
lines, such as, for example, MCF-7 (non-metastatic tumor
cell line) and MDA-MB-435 (high metastatic potential
tumor cell line; Watts, C.K. et al., 1994, Breast Cancer
Res. Treat. 21:95-105 and Rose, D.P. et al., 1993, J.
Natl. Cancer Inst. 85:1743-1747).

As shown in the Example presented in Section
6, below, this paradigm has been successfully utilized to
identify a gene, referred to herein as the 030 gene,
which is differentially expressed in cells exhibiting a
high metastatic potential relative to cells exhibiting a
low metastatic potential. Specifically, the 030 gene is
expressed at a many-fold higher level in low metastatic
potential cells relative to cells exhibiting a high
metastatic potential.

In a second paradigm, referred to herein as the in
vivo paradigm, animal models of tumor progression
disorders can be utilized to discover differentially
expressed gene sequences. The in vivo nature of such
tumor progression models can prove to be especially
predictive of the analogous responses in living patients.

A variety of tumor progression animal models can
be used as part of the in vivo paradigms. For
example, animal models of tumor progression may be
generated by passaging tumor cells in animals (e.g.,
mice), leading to the appearance of tumors within these
animals.

Additional animal models, some of which may
exhibit differing tumor progression characteristics, may
be generated from the original animal models described
above. For example, the tumors which result in the
original animals can be removed and grown in vitro.
Cells from these in vitro cultures can then be passaged
in animals and tumors resulting from this passage can
then be isolated. RNA from pre-passage cells, and from cells
isolated after one or more rounds of passage, can then be
isolated and analyzed for differential expression. The
differential expression can be compared to the metastatic
potential of such cells. These cells can now
represent cells from different tumor progression states,
or cells within a given tumor progression state
exhibiting differing degrees of severity or
aggressiveness. Such passaging techniques can utilize
any of the variant cell lines described, above, for the
in vitro paradigms.

Additionally, animal models for tumor progression
which can be utilized for such an in vivo paradigm
include any of the animal models described, below, in
Section 5.7.1. Other models include the transgenic mouse
model for melanoma (Mintz, B. and Silvers, W.K., 1993,
Proc. Natl. Acad. Sci. USA 90:8817-8821), transgenic mice
which carry specific adenomatous polyposis coli (APC)
gene mutations (Fodde, R., et al., 1994, Proc. Natl.
Acad. Sci. USA 91:8969-8973) and the transgenic mouse in
which the mammary tumor virus LTR/c-myc gene is
anomalously expressed (Leder, A., et al., 1986, Cell
45:485-495).

A third paradigm, referred to herein as the
"specimen paradigm," utilizes samples from surgical and
biopsy specimens. Such specimens can represent normal
tissue, primary, secondary or metastasized tumors
obtained from patients having undergone surgical
treatment for disorders involving tumor progression such
as, for example, melanomas, colon carcinomas, lung
carcinomas, prostatic cancers and breast cancers.

Surgical specimens can be procured under standard
conditions involving freezing and storing in liquid
nitrogen (see, for example, Karmali, R.A., et al., 1983,
Br. J. Cancer 48:689-696). RNA from specimen cells is
isolated by, for example, differential centrifugation of
homogenized tissue, and analyzed for differential
expression relative to other specimen cells, preferably
cells obtained from the same patient.

In paradigms designed to identify genes which are
involved in tumor progression, compounds known to have an
ameliorative effect on the tumor progression symptoms can
also be used in paradigms to detect differentially
expressed genes. Such compounds can include known
therapeutics, as well as compounds that are not useful as
therapeutics due to their harmful side effects. For
example, tumor cells that are cultured as explained in
this Section, above, can be exposed to one of these
compounds and analyzed for differential gene expression
with respect to untreated tumor cells, according to the
methods described below in Section 5.1.1.2. In
principle, however, according to the paradigm, any cell
type involved in tumor progression and disorders thereof
can be treated by these compounds at any stage of the
tumor progression process.

Cells involved in tumor progression can also be
compared to unrelated cells (e.g., fibroblasts) which
have been treated with the compound, such that any
generic effects on gene expression that might not be
related to the disease or its treatment may be
identified. Such generic effects might be manifest, for
example, by changes in gene expression that are common to
the test cells and the unrelated cells upon treatment
with the compound.

By these methods, the genes and gene products upon
which these compounds act can be identified and used in
the assays described below to identify novel therapeutic
compounds for inhibition of tumor progression and the
treatment of tumor progression disorders, including
metastatic diseases.

5.1.1.2. ANALYSIS OF PARADIGM MATERIAL 
In order to identify differentially expressed
genes, RNA, either total or mRNA, can be isolated from
cells utilized in paradigms such as those described
earlier in Section 5.1.1.1. Any RNA isolation technique
which does not select against the isolation of mRNA can
be utilized for the purification of such RNA samples.
See, for example, Ausubel, F.M. et al., eds., 1987-1993,
Current Protocols in Molecular Biology, John Wiley &
Sons, Inc., New York. Additionally, large numbers of
tissue samples can readily be processed using techniques
well known to those of skill in the art, such as, for
example, the single-step RNA isolation process of
Chomczynski, P. (1989, U.S. Patent No. 4,843,155).

Transcripts within the collected RNA samples which
represent RNA produced by differentially expressed genes
can be identified by utilizing a variety of methods which
are well known to those of skill in the art. For
example, differential screening (Tedder, T.F. et al.,
1988, Proc. Natl. Acad. Sci. USA 208-212), subtractive
hybridization (Hedrick, S.M. et al., 1984, Nature
308:149-153; Lee, S.W. et al., 1984, Proc. Natl. Acad.
Sci. USA 88:2825), and, preferably, differential display
(Liang, P. and Pardee, A.B., 1993, U.S. Patent No.
5,262,311), can be utilized to identify nucleic acid
sequences derived from genes that are differentially
expressed.

Differential screening involves the duplicate
screening of a cDNA library in which one copy of the
library is screened with a total cell cDNA probe
corresponding to the mRNA population of one cell type
while a duplicate copy of the cDNA library is screened
with a total cDNA probe corresponding to the mRNA
population of a second cell type. For example, one cDNA
probe can correspond to a total cell cDNA probe of a cell
type or tissue derived from a control subject, while the
second cDNA probe can correspond to a total cell cDNA
probe of the same cell type derived from an experimental
subject. Those clones which hybridize to one probe but
not to the other potentially represent clones derived
from genes differentially expressed in the cell type of
interest in control versus experimental subjects.
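
The selection rule in differential screening (keep clones that
hybridize to one probe but not to the other) amounts to a set
comparison. The following is a minimal sketch; the clone identifiers
and the function name `differential_clones` are hypothetical, and real
hybridization data would first have to be scored as positive or
negative for each clone.

```python
from typing import Dict, Set


def differential_clones(
    hits_control: Set[str], hits_experimental: Set[str]
) -> Dict[str, Set[str]]:
    """Return clones that hybridized to exactly one of the two probes."""
    return {
        # Positive with the control probe only.
        "control_only": hits_control - hits_experimental,
        # Positive with the experimental probe only.
        "experimental_only": hits_experimental - hits_control,
    }


# Hypothetical screening result for two duplicate copies of one library.
control_hits = {"clone1", "clone2", "clone3"}
experimental_hits = {"clone2", "clone3", "clone4"}
candidates = differential_clones(control_hits, experimental_hits)
```

Clones positive with both probes (here `clone2` and `clone3`) are not
selected; the remaining candidates would still need corroboration by
an independent technique.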

Subtractive hybridization techniques generally
involve the isolation of mRNA taken from two different
sources, e.g., control and experimental tissue, the
hybridization of the mRNA or single-stranded cDNA
reverse-transcribed from the isolated mRNA, and the
removal of all hybridized, and therefore double-stranded,
sequences. The remaining non-hybridized, single-stranded
cDNAs potentially represent clones derived from genes
that are differentially expressed in the two mRNA
sources. Such single-stranded cDNAs are then used as the
starting material for the construction of a library
comprising clones derived from differentially expressed
genes.

The differential display technique describes a
procedure, utilizing the well-known polymerase chain
reaction (PCR; the experimental embodiment set forth in
Mullis, K.B., 1987, U.S. Patent No. 4,683,202), which
allows for the identification of sequences derived from
genes which are differentially expressed. First,
isolated RNA is reverse-transcribed into single-stranded
cDNA, utilizing standard techniques which are well known
to those of skill in the art. Primers for the reverse
transcriptase reaction can include, but are not limited
to, oligo dT-containing primers, preferably of the 3'
primer type of oligonucleotide described below. Next,
this technique uses pairs of PCR primers, as described
below, which allow for the amplification of clones
representing a random subset of the RNA transcripts
present within any given cell. Utilizing different pairs
of primers allows each of the mRNA transcripts present in
a cell to be amplified. Among such amplified transcripts
can be identified those which have been produced from
differentially expressed genes.

The 3' oligonucleotide primer of the primer pairs
can contain an oligo dT stretch of 10-13 dT nucleotides
at its 5' end, preferably 11, which hybridizes to the
poly(A) tail of mRNA or to the complement of a cDNA
reverse transcribed from an mRNA poly(A) tail. Second,
in order to increase the specificity of the 3' primer,
the primer can contain one or more, preferably two,
additional nucleotides at its 3' end. Because,
statistically, only a subset of the mRNA derived
sequences present in the sample of interest will
hybridize to such primers, the additional nucleotides
allow the primers to amplify only a subset of the mRNA
derived sequences present in the sample of interest.
This is preferred in that it allows more accurate and
complete visualization and characterization of each of
the bands representing amplified sequences.

The 5' primer can contain a nucleotide sequence
expected, statistically, to have the ability to hybridize
to cDNA sequences derived from the tissues of interest.
The nucleotide sequence can be an arbitrary one, and the
length of the 5' oligonucleotide primer can range from
about 9 to about 15 nucleotides, with about 13
nucleotides being preferred.
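
The way the additional 3' nucleotides restrict each primer to a subset
of transcripts can be sketched as follows. This is an illustration
only: the function names are hypothetical, sequences are written 5' to
3' in DNA letters, and skipping anchors that begin with T (which would
simply lengthen the oligo dT stretch) is an assumption not stated in
the text.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def anchored_primers(dt_length: int = 11, anchor_length: int = 2) -> list:
    """Enumerate 3' primers: an oligo(dT) stretch plus anchor bases at the 3' end."""
    primers = []
    for anchor in product("ACGT", repeat=anchor_length):
        # Assumption: an anchor starting with T would not anchor the primer
        # at the junction with the poly(A) tail, so it is skipped.
        if anchor and anchor[0] == "T":
            continue
        primers.append("T" * dt_length + "".join(anchor))
    return primers


def can_prime(primer: str, transcript: str) -> bool:
    """True if the primer is exactly complementary to a site in the transcript."""
    return reverse_complement(primer) in transcript


# Hypothetical transcript ending in ...CTG followed by a poly(A) tail.
transcript = "CCGGATCTG" + "A" * 20
matching = [p for p in anchored_primers() if can_prime(p, transcript)]
```

With the default settings twelve primers are generated, and only the
one whose anchor pairs with the two bases preceding the poly(A) tail
matches the example transcript, which is the subsetting effect the
passage describes.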

Additionally, arbitrary primer sequences cause the
lengths of the amplified partial cDNAs produced to be
variable, thus allowing different clones to be separated
by using standard denaturing sequencing gel
electrophoresis.

PCR reaction conditions should be chosen which
optimize amplified product yield and specificity, and,
additionally, produce amplified products of lengths which
can be resolved utilizing standard gel electrophoresis
techniques. Such reaction conditions are well known to
those of skill in the art, and important reaction
parameters include, for example, length and nucleotide
sequence of oligonucleotide primers as discussed above,
and annealing and elongation step temperatures and
reaction times.

The pattern of clones resulting from the reverse
transcription and amplification of the mRNA of two
different cell types is displayed via sequencing gel
electrophoresis and compared. Differences in the two
banding patterns indicate potentially differentially
expressed genes.

Once potentially differentially expressed gene
sequences have been identified via bulk techniques such
as, for example, those described above, the differential
expression of such putatively differentially expressed
genes should be corroborated. Corroboration can be
accomplished via, for example, such well-known techniques
as Northern analysis, quantitative RT-coupled PCR or
RNase protection.

Upon corroboration, the differentially expressed
genes can be further characterized, and can be identified
as target and/or fingerprint genes, as discussed, below,
in Section 5.1.4.

Also, amplified sequences of differentially
expressed genes obtained through differential display can
be used to isolate the full-length clones of the
corresponding gene. The full-length coding portion of
the gene can readily be isolated, without undue
experimentation, by molecular biological techniques well
known in the art. For example, the isolated
differentially expressed amplified fragment can be
labeled and used to screen a cDNA library.
Alternatively, the labeled fragment can be used to screen
a genomic library.

PCR technology can also be utilized to isolate
full-length cDNA sequences. As described in this section,
above, the isolated amplified gene fragments (of at least
about 10 nucleotides, preferably longer, of about 15
nucleotides) obtained through differential display have
their 5' terminal end at some random point within the
gene and have 3' terminal ends at a position
corresponding to the 3' end of the transcribed portion of
the gene. Once nucleotide sequence information from an
amplified fragment is obtained, the remainder of the
gene (i.e., the 5' end of the gene, when utilizing
differential display) can be obtained using, for example,
RT PCR.

In one embodiment of such a procedure for the
identification and cloning of full-length gene sequences,
RNA can be isolated, following standard procedures, from
an appropriate tissue or cellular source.

A reverse transcription reaction can then be
performed on the RNA using an oligonucleotide primer
complementary to the mRNA that corresponds to the
amplified cloned fragment, for the priming of first
strand synthesis. Because the primer is anti-parallel to
the mRNA, extension will proceed toward the 5' end of the
mRNA. The resulting RNA/DNA hybrid can then be "tailed"
with guanines using a standard terminal transferase
reaction, the hybrid can be digested with RNase H, and
second strand synthesis can then be primed with a poly-C
primer. Using the two primers, the 5' portion of the
gene is then amplified using PCR. Sequences obtained can
then be isolated and recombined with previously isolated
sequences to generate a full-length cDNA of the
differentially expressed genes of the invention. For a
review of cloning strategies and recombinant DNA
techniques which can be used, see, e.g., Sambrook et al.,
1989, Molecular Cloning, A Laboratory Manual, Cold
Spring Harbor Press, N.Y.; and Ausubel et al., 1989,
Current Protocols in Molecular Biology, (Greene Publishing
Associates and Wiley Interscience, N.Y.).
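
The strand logic of the 5' extension procedure described above can be
traced with a small sequence sketch. It models only base pairing, not
enzyme behavior; the function names, the example transcript and the
tail length of 10 are hypothetical, and all sequences are written 5'
to 3' in DNA letters.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def five_prime_extension(mrna: str, primer_end: int, tail_length: int = 10):
    """Trace first-strand synthesis, G-tailing and second-strand synthesis.

    mrna       -- transcript, 5'->3'
    primer_end -- index just past the mRNA region bound by the
                  gene-specific primer (taken from the known fragment)
    """
    # First strand: the primer is extended toward the 5' end of the mRNA,
    # so the cDNA is the reverse complement of mrna[0:primer_end].
    first_strand = reverse_complement(mrna[:primer_end])
    # Terminal transferase adds guanines to the 3' end of the first strand.
    tailed = first_strand + "G" * tail_length
    # A poly-C primer anneals to the G tail; the second strand is the
    # reverse complement of the tailed first strand.
    second_strand = reverse_complement(tailed)
    return first_strand, tailed, second_strand


# Hypothetical transcript; the known fragment is assumed to start at index 9.
mrna = "ATGGCCTTAGGACTTCAGGA"
first, tailed, second = five_prime_extension(mrna, primer_end=15)
```

The second strand consists of the poly-C primer followed by the mRNA
sequence from its 5' end up to the primer site, so PCR with the
gene-specific primer and the poly-C primer recovers the 5' portion of
the gene that the original fragment lacked.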



5.2. METHODS FOR THE IDENTIFICATION
OF PATHWAY GENES

Methods are described herein for the
identification of pathway genes. "Pathway gene," as used
herein, refers to a gene whose gene product exhibits the
ability to interact with gene products involved in tumor
progression. A pathway gene can be differentially
expressed and, therefore, can have the characteristics of
a target and/or fingerprint gene.

Any method suitable for detecting protein-protein
interactions can be employed for identifying pathway gene
products by identifying interactions between gene
products and gene products known to be involved in tumor
progression and tumor progression disorders, including
metastatic disorders. Such known gene products can be
cellular or extracellular proteins. Those gene products
which interact with such known gene products represent
pathway gene products and the genes which encode them
represent pathway genes.

Among the traditional methods which can be
employed are co-immunoprecipitation, cross-linking and
co-purification through gradients or chromatographic
columns. Utilizing procedures such as these allows for
the identification of pathway gene products. Once
identified, a pathway gene product can be used, in
conjunction with standard techniques, to identify its
corresponding pathway gene. For example, at least a
portion of the amino acid sequence of the pathway gene
product can be ascertained using techniques well known to
those of skill in the art, such as via the Edman
degradation technique (see, e.g., Creighton, 1983,
Proteins: Structures and Molecular Principles, W.H.
Freeman & Co., N.Y., pp. 34-49). The amino acid sequence
obtained can be used as a guide for the generation of
oligonucleotide mixtures that can be used to screen for
pathway gene sequences. Screening can be accomplished,
for example, by standard hybridization or PCR techniques.
Techniques for the generation of oligonucleotide mixtures
and the screening are well known. (See, Ausubel,
supra, and PCR Protocols: A Guide to Methods and
Applications, 1990, Innis, M. et al., eds., Academic
Press, Inc., New York).

Additionally, methods can be employed which result
in the simultaneous identification of pathway genes which
encode the protein interacting with a protein involved in
tumor progression and tumor progression disorders,
including metastatic diseases. These methods include,
for example, probing expression libraries with labeled
protein known or suggested to be involved in metastatic
diseases, using this protein in a manner similar to the
well-known technique of antibody probing of λgt11
libraries.

One method which detects protein interactions in
vivo, the yeast two-hybrid system, is described in detail
for illustration only and not by way of limitation. One
version of this system has been described (Chien et al.,
1991, Proc. Natl. Acad. Sci. USA, 9578-9582) and is
commercially available from Clontech (Palo Alto, CA).

Briefly, utilizing such a system, plasmids are
constructed that encode two hybrid proteins: the first
hybrid protein consists of the DNA-binding domain of a
transcription factor (e.g., activation protein) fused to
a known protein, in this case, a protein known to be
involved in tumor progression, and the second hybrid
protein consists of the transcription factor's activation
domain fused to an unknown protein that is encoded by a
cDNA which has been recombined into this plasmid as part
of a cDNA library. The plasmids are transformed into a
strain of the yeast Saccharomyces cerevisiae that
contains a reporter gene (e.g., lacZ) whose expression is
regulated by the transcription factor's binding site.
Either hybrid protein alone cannot activate transcription
of the reporter gene. The DNA binding hybrid protein
cannot activate transcription because it does not provide
the activation domain function and the activation domain
hybrid protein cannot activate transcription because it
lacks the domain required for binding to its target site
(i.e., it cannot localize to the transcription activator
protein's binding site). Interaction between the DNA
binding hybrid protein and the library encoded protein
reconstitutes the functional transcription factor and
results in expression of the reporter gene, which is
detected by an assay for the reporter gene product.
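
The reporter logic of the two-hybrid system described above reduces to
a conjunction: the reporter is expressed only when both hybrid
proteins are present and the two fused proteins interact. The sketch
below is a purely logical model with hypothetical names
(`reporter_expressed`, `screen_library`, `bait`, `preyA`, and so on);
it does not model expression levels or false positives.

```python
from typing import FrozenSet, List, Set


def reporter_expressed(has_binding_hybrid: bool,
                       has_activation_hybrid: bool,
                       proteins_interact: bool) -> bool:
    """Reporter is on only if both hybrids are present and they interact."""
    return has_binding_hybrid and has_activation_hybrid and proteins_interact


def screen_library(bait: str,
                   library: List[str],
                   interactions: Set[FrozenSet[str]]) -> List[str]:
    """Return library proteins whose co-transformants would express the reporter."""
    positives = []
    for prey in library:
        interacts = frozenset((bait, prey)) in interactions
        # Every co-transformant carries the bait hybrid and one prey hybrid.
        if reporter_expressed(True, True, interacts):
            positives.append(prey)
    return positives


# Hypothetical interaction data: only preyB binds the bait.
known_interactions = {frozenset(("bait", "preyB"))}
hits = screen_library("bait", ["preyA", "preyB", "preyC"], known_interactions)
```

Either hybrid alone leaves the reporter off in this model, mirroring
the statement that neither hybrid protein by itself can activate
transcription.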

The two-hybrid system or related methodology can
be used to screen activation domain libraries for
proteins that interact with a known "bait" gene product.
By way of example, and not by way of limitation, gene
products (e.g., 030 gene products) known to be involved
in tumor progression and tumor progression disorders,
such as metastatic diseases, can be used as the bait gene
products. Total genomic or cDNA sequences are fused to
the DNA encoding an activation domain. This library and
a plasmid encoding a hybrid of the bait gene product
fused to the DNA-binding domain are cotransformed into a
yeast reporter strain, and the resulting transformants
are screened for those that express the reporter gene.
For example, and not by way of limitation, the bait gene
can be cloned into a vector such that it is
translationally fused to the DNA encoding the DNA-binding
domain of the GAL4 protein. The colonies are purified
and the (library) plasmids responsible for reporter gene
expression are isolated. The inserts in the plasmids are
sequenced to identify the proteins encoded by the cDNA or
genomic DNA.

A cDNA library of a cell or tissue source which
expresses proteins predicted to interact with the bait
gene product can be made using methods routinely
practiced in the art. According to the particular system
described herein, the library is generated by inserting
the cDNA fragments into a vector such that they are
translationally fused to the activation domain of GAL4.
This library can be co-transformed along with the bait
gene-GAL4 fusion plasmid into a yeast strain which
contains a lacZ gene whose expression is controlled by a
promoter which contains a GAL4 activation sequence. A
cDNA encoded protein, fused to the GAL4 activation domain,
that interacts with the bait gene product will
reconstitute an active GAL4 transcription factor and
thereby drive expression of the lacZ gene. Colonies
which express lacZ can be detected by their blue color in
the presence of X-gal. cDNA containing plasmids from
such a blue colony can then be purified and used to
produce and isolate the bait gene product interacting
protein using techniques routinely practiced in the art.
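
The screening logic of the two-hybrid system can be illustrated with a toy model. The following Python sketch is illustrative only: the names `Hybrid`, `reporter_expressed` and `screen`, and the example interaction set, are hypothetical and are not part of the method described herein. It assumes that the lacZ reporter is expressed only when a DNA-binding domain hybrid and an activation domain hybrid physically interact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hybrid:
    """A fusion protein: a protein of interest fused to one GAL4 domain."""
    protein: str
    domain: str  # "DBD" (DNA-binding domain) or "AD" (activation domain)


def reporter_expressed(bait, prey, interactions):
    """Return True if the bait/prey pair would reconstitute GAL4 activity.

    `interactions` is a set of frozensets naming protein pairs that are
    assumed, for this toy model, to bind each other.
    """
    # Both halves of GAL4 must be present, one on each hybrid.
    has_both_domains = {bait.domain, prey.domain} == {"DBD", "AD"}
    # The fused proteins must interact to bring the two domains together.
    binds = frozenset((bait.protein, prey.protein)) in interactions
    return has_both_domains and binds


def screen(bait, library, interactions):
    """Return the library proteins whose colonies would score as lacZ-positive."""
    return [p.protein for p in library if reporter_expressed(bait, p, interactions)]
```

In this sketch a library member is reported only when it both carries the complementary GAL4 domain and is listed as a binding partner of the bait, which mirrors the requirement that the functional transcription factor be reconstituted.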

Once a pathway gene has been identified and
isolated, it can be further characterized as, for
example, discussed below, in Section 5.3.

5.3. CHARACTERIZATION OF DIFFERENTIALLY 
EXPRESSED AND PATHWAY GENES 

Differentially expressed genes, such as those
identified via the methods discussed, above, in Section
5.1, and pathway genes, such as those identified via the
methods discussed, above, in Section 5.2, as well
as genes identified by alternative means, can be further
characterized by utilizing, for example, methods such as
those discussed herein. Such genes will be referred to
herein as "identified genes."

Analyses such as those described herein yield
information regarding the biological function of the
identified genes. An assessment of the biological
function of the differentially expressed genes, in
addition, will allow for their designation as target
and/or fingerprint genes.

Specifically, any of the differentially expressed
genes whose further characterization indicates that a
modulation of the gene's expression or a modulation of
the gene product's activity can inhibit tumor progression
will be designated "target genes," as defined, above, in
Section 5.1. Such target genes and target gene products,
along with those discussed below, will constitute the
focus of the compound discovery strategies discussed,
below, in Section 5.8. Further, such target genes,
target gene products and/or modulating compounds can be
used as part of the tumor progression disorder treatment
methods described, below, in Section 5.9.

Any of the differentially expressed genes whose
further characterization indicates that such modulations
do not positively affect tumor progression, but whose
expression pattern contributes to a gene expression
"fingerprint" pattern correlative of, for example, tumor
progression will be designated a "fingerprint gene."
"Fingerprint patterns" will be more fully discussed,
below, in Section 5.11.1. It should be noted that each
of the target genes can also function as fingerprint
genes, as can all or a portion of the pathway genes.

It should further be noted that the pathway genes
can also be characterized according to techniques such as
those described herein. Those pathway genes which yield
information indicating that they are differentially
expressed and that modulation of the gene's expression or
a modulation of the gene product's activity can inhibit
tumor progression or ameliorate tumor progression-associated
symptoms will also be designated "target
genes." Such target genes and target gene products,
along with those discussed above, will constitute the
focus of the compound discovery strategies discussed,
below, in Section 5.8 and can be used as part of the
treatment methods described in Section 5.9, below.

It should be additionally noted that the
characterization of one or more of the pathway genes can
reveal a lack of differential expression, but evidence
that modulation of the gene's activity or expression can,
nonetheless, ameliorate symptoms of tumor progression.
In such cases, these genes and gene products would also
be considered a focus of the compound discovery
strategies of Section 5.8, below, and can be used as part
of the treatment methods described in Section 5.9, below.

In instances wherein a pathway gene's
characterization indicates that modulation of gene
expression or gene product activity cannot retard the
tumor progression diseases of interest, but the gene is
differentially expressed and contributes to a gene
expression fingerprint pattern correlative of tumor
progression states or disorders, such as metastatic
diseases, such pathway genes can additionally be
designated as fingerprint genes.
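
The designation rules set out in the preceding paragraphs can be summarized as a small decision function. The Python sketch below is one possible reading of those rules, not a definitive statement of them: the function name `designate`, its three boolean parameters and the label strings are hypothetical, and each characterization result is reduced to a simple yes/no for illustration.

```python
def designate(differentially_expressed, modulation_inhibits, in_fingerprint_pattern):
    """Return the set of designations suggested by the rules in Section 5.3.

    differentially_expressed: the gene shows differential expression.
    modulation_inhibits: modulating expression or product activity can
        inhibit tumor progression or ameliorate its symptoms.
    in_fingerprint_pattern: the expression pattern contributes to a
        fingerprint pattern correlative of tumor progression.
    """
    labels = set()
    if modulation_inhibits:
        if differentially_expressed:
            labels.add("target gene")
        else:
            # Pathway genes lacking differential expression are still
            # described as a focus of compound discovery and treatment.
            labels.add("compound discovery focus")
    if differentially_expressed and in_fingerprint_pattern:
        # Target genes can also function as fingerprint genes.
        labels.add("fingerprint gene")
    return labels
```

For example, `designate(True, False, True)` yields only the fingerprint designation, whereas a gene meeting all three criteria receives both the target and fingerprint designations.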

A variety of techniques can be utilized to further
characterize the identified genes. First, the nucleotide
sequence of the identified genes, which can be obtained
by utilizing standard techniques well known to those of
skill in the art, can be used to further characterize
such genes. For example, the sequence of the identified
genes can reveal homologies to one or more known sequence
motifs which can yield information regarding the
biological function of the identified gene product.

Second, an analysis of the tissue and/or cell type
distribution of the mRNA produced by the identified genes
can be conducted, utilizing standard techniques well
known to those of skill in the art. Such techniques can
include, for example, Northern analyses, RT-coupled PCR
and RNase protection techniques. Such analyses provide
information as to whether the identified genes are
expressed in tissues expected to contribute to tumor
progression. Such analyses can also provide quantitative
information regarding steady state mRNA regulation,
yielding data concerning which of the identified genes
exhibits a high level of regulation in, preferably,
tissues which can be expected to contribute to tumor
progression. Additionally, standard in situ
hybridization techniques can be utilized to provide
information regarding which cells within a given tissue
express the identified gene. Such an analysis can
provide information regarding the biological function of
an identified gene relative to given tumor progression in
instances wherein only a subset of the cells within the
tissue is thought to be relevant to the disorder.

Third, the sequences of the identified genes can
be used, utilizing standard techniques, to place the
genes onto genetic maps, e.g., mouse (Copeland, N.G. and
Jenkins, N.A., 1991, Trends in Genetics 7:113-118) and
human genetic maps (Cohen, D., et al., 1993, Nature
366:698-701). Such mapping information can yield
information regarding the genes' importance to human
disease by, for example, identifying genes which map
within genetic regions to which known genetic tumor
progression disorders map.

Fourth, the biological function of the identified
genes can be more directly assessed by utilizing relevant
in vivo and in vitro systems. In vivo systems can
include, but are not limited to, animal systems which
naturally exhibit symptoms of tumor progression, such as
metastatic disease, or ones which have been engineered to
exhibit such symptoms. For example, tumor progression
animal models may be generated by injecting animals, such
as mice, with tumor cells, some of which will give rise
to tumors within the injected animals. Among the cells
which may be utilized for such a purpose are cells
listed, above, in Section 5.1.1.1, such as the B16 cell
variants.

The role of identified gene products (e.g., 030
gene products) can be determined by transfecting cDNAs
encoding these gene products into appropriate cell lines,
such as, for example, a B16 cell line variant, and
analyzing the effect on tumor progression
characteristics. For example, the role/function of genes
important in the progression of human colorectal cancers
is assessed using the KM12c (low metastatic potential)
and KM12L4 (highly metastatic) cells implanted into the
spleens of nude mice, and the number of hepatic tumors that
develop is determined. The function of genes isolated
using human colorectal tumors and their hepatic
metastases is assessed by expressing the gene in the
appropriate KM12 variant. Additionally, the
role/function of genes important in the progression of
prostatic and breast cancers is assessed using
appropriate cell lines described above in Section
5.1.1.1. Importantly, the role/function of genes
important in the progression of melanoma, colon, prostate
and breast cancers in humans is assessed using biopsy
specimens from patients having undergone surgical
treatment, as described in Section 5.1.1.1, above.

Further, such systems can include, but are not
limited to, transgenic animal systems such as those
described, below, in Section 5.7.1. In vitro
systems can include, but are not limited to, cell-based
systems comprising cell types known or suspected of
contributing to tumor progression. Such cells can be
wild type cells, or can be non-wild type cells containing
modifications known to, or suspected of, contributing to
tumor progression. Such systems are discussed in detail,
below, in Section 5.7.2. The procedure to identify and
isolate the human homologue of the fomy030 gene is
described, below, in Section 5.7.3.

In further characterizing the biological function
of the identified genes, the expression of these genes
can be modulated within the in vivo and/or in vitro
systems, i.e., either over- or under-expressed, and the
subsequent effect on the system then assayed.
Alternatively, the activity of the product of the
identified gene can be modulated by either increasing or
decreasing the level of activity in the in vivo and/or in
vitro system of interest, and its subsequent effect then
assayed.

The information obtained through such
characterizations can suggest relevant methods for the
treatment of tumor progression and tumor progression
disorders involving the gene of interest. Further,
relevant methods for controlling the spread of tumor
cells involving the gene of interest can be suggested by
information obtained from such characterization. For
example, treatment can include a modulation of gene
expression and/or gene product activity.
Characterization procedures such as those described
herein can indicate whether such modulation should involve
an increase or a decrease in the expression or activity
of the gene or gene product of interest. Such methods of
treatment are discussed, below, in Section 5.9.

5.4. DIFFERENTIALLY EXPRESSED 
AND PATHWAY GENES 

Differentially expressed genes, such as those
identified in Section 5.1.1, above, and pathway genes,
such as those identified in Section 5.2, above, are
described herein.

The differentially expressed and pathway genes of
the invention are listed below, in Table 1. The
nucleotide sequence for the differentially expressed
fomy030 gene is shown in FIGS. 2 and 3A and 3B.
Specifically, FIG. 2 depicts the nucleotide sequence (SEQ
ID NO:1) of the amplified cDNA band initially identified
via differential display analysis, which is referred to
herein as romy030. FIGS. 3A and 3B depict the nucleotide
sequence (SEQ ID NO:2) of a fomy030 cDNA clone which was
isolated using a romy030 probe. The deduced amino acid
sequence also is shown in FIGS. 3A and 3B (SEQ ID NO:3).
FIG. 5 shows the nucleotide (SEQ ID NO:6) and deduced
amino acid sequences (SEQ ID NO:7) of a fohy030 cDNA
clone which was isolated using the entire mouse fomy030
cDNA as a probe. FIG. 6 shows an alternative splice form
of fohy030 (SEQ ID NOs:8 and 9).

Table 1 summarizes information regarding the
further characterization of the differentially expressed
fomy030 gene of the invention. Table 2 lists E. coli
clones, deposited with the Agricultural Research Service
Culture Collection (NRRL), which contain sequences found
within the genes of Table 1.

In Table 1, the paradigm used initially to detect
the differentially expressed gene is described under the
column headed "Paradigm of Original Detection." In this
column, "↑" indicates that gene expression is higher
(i.e., there is a greater steady state amount of
detectable mRNA produced by a given gene) in the
indicated cell type relative to the other cell type,
while "↓" indicates that gene expression is lower (i.e.,
there is a lower steady state amount of detectable mRNA
produced by a given gene) in the indicated cell type
relative to the other cell type. As indicated under this
column, the 030 gene was initially identified via a
differential screen between B16 F1 (low metastatic
potential cells) and B16 F10 (high metastatic potential
cells) in which 030 gene expression is lower in the high
metastatic potential B16 F10 cell line than in the low
metastatic potential B16 F1 cell line.

The Table 1 column headed "Paradigm Expression
Pattern" lists the cell type in which gene expression was
initially detected. In the case of the 030 gene, gene
expression was first detected in melanoma (i.e., B16)
cells. "Detectable," as used herein, refers to levels of
mRNA which are detectable via standard differential
display, Northern, RT-coupled PCR and/or RNase protection
techniques, which are well known to those of skill in the
art.

Cell types in which differential expression was
detected are summarized in Table 1 under the column
headed "Cell Type Detected in." In the case of the 030
gene, expression has additionally been detected within
melanocyte cells.

Additionally, in instances wherein the genes
contain nucleotide sequences similar or homologous to
sequences found in nucleic acid databases, references to
such similarities are listed. Because the 030 gene is a
novel gene, i.e., no homologous gene sequences are
present in published databases, no such reference is
listed.

Finally, nucleotide sequences contained within the
differentially expressed genes are listed in the Figure
indicated under the heading "Seq." In the case of the
fomy030 gene, such sequences are listed in FIGS. 2 and 3A
and 3B, and for fohy030, in FIGS. 5 and 6.

The genes listed in Table 1 can be obtained using
cloning methods well known to those skilled in the art,
including, but not limited to, the use of
probes to detect the genes within an appropriate cDNA or
gDNA (genomic DNA) library. (See, for example, Sambrook
et al., 1989, Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratories.) Probes for the novel
sequences reported herein can be obtained directly from
the isolated clones deposited with the NRRL, as indicated
in Table 2, below. Alternatively, oligonucleotide probes
for the novel genes can be synthesized, using techniques
well known to those of skill in the art, based on the DNA
sequences disclosed herein in FIGS. 2, 3A, 5, and 6.

The probes can be used to screen cDNA libraries
prepared from an appropriate cell or cell line in which
the gene is transcribed. For example, the genes
described herein that were detected in melanocyte cells
can be cloned from a cDNA library prepared from
melanocyte cells such as, for example, melan-c
(Hodgkinson, C.A., et al., 1993, Cell 74:395-404), the
cDNA libraries developed from the human melanoma cell
line A2058 (Clontech, Palo Alto, CA) and cDNA libraries
developed from the murine melanoma cell line K1735
(Stratagene, La Jolla, CA). Genomic DNA libraries can be
prepared from any source.



TABLE 1

Differentially Expressed and Pathway Genes

GENE      Sequence ID   Paradigm of Original     Paradigm Expression   Cell Type
                        Detection (↑/↓)          Pattern               Detected in

fomy030                 B16 ↑ F1                 melanoma cells        melanocyte
                        B16 ↓ F10

fohy030   6 & 8         benign nevi ↑            biopsy samples        melanocyte
                        malignant melanoma ↓



Table 2, below, lists an E. coli strain as
deposited with the NRRL, which contains an isolated
plasmid fomy030 clone. The clone contains a fomy030 cDNA
in a pBlueScript SK- (Stratagene, La Jolla, CA) vector
which was isolated from a mouse melanocyte cDNA library
screened with a romy030 probe, as described in Section
6.2, below.

TABLE 2

GENE      STRAIN DEPOSITED    PLASMID CLONE CONTAINED
          WITH NRRL           WITHIN DEPOSITED STRAIN

fomy030   FOMY030             pFOMY030

fohy030







As used herein, "differentially expressed gene"
(i.e., target and fingerprint genes) or "pathway gene"
refers to (a) a gene containing: at least one of the DNA
sequences disclosed herein (as shown in FIGS. 2, 3A, 3B,
5, and 6) or contained in the clones listed in Table 2,
as deposited with the NRRL; (b) any DNA sequence that
encodes the amino acid sequence encoded by: the DNA
sequences disclosed herein (as shown in FIGS. 2, 3A, 3B,
5, and 6), contained in the clones listed in Table 2, as
deposited with the NRRL, or contained within the coding
region of the gene to which the DNA sequences disclosed
herein (as shown in FIGS. 2, 3A, 3B, 5, and 6) or
contained in the clones listed in Table 2, as deposited
with the NRRL, belong; (c) any DNA sequence that
hybridizes to the complement of: the coding sequences
disclosed herein (as shown in FIGS. 2, 3A, 3B, 5, and 6),
contained in clones listed in Table 2, as deposited with
the NRRL, or contained within the coding region of the
gene to which the DNA sequences disclosed herein (as
shown in FIGS. 2, 3A, 3B, 5, and 6) or contained in the
clones listed in Table 2, as deposited with the NRRL,
belong, under highly stringent conditions, e.g.,
hybridization to filter-bound DNA in 0.5 M NaHPO4, 7%
sodium dodecyl sulfate (SDS), 1 mM EDTA at 65°C, and
washing in 0.1xSSC/0.1% SDS at 68°C (Ausubel F.M. et al.,
eds., 1989, Current Protocols in Molecular Biology, Vol.
I, Green Publishing Associates, Inc., and John Wiley &
Sons, Inc., New York, at p. 2.10.3), and encodes a gene
product functionally equivalent to a gene product encoded
by a gene of (a), above; and/or (d) any DNA sequence that
hybridizes to the complement of: the coding sequences
disclosed herein (as shown in FIGS. 2, 3A, 3B, 5, and 6),
contained in the clones listed in Table 2, as deposited
with the NRRL, or contained within the coding region of
the gene to which DNA sequences disclosed herein (as
shown in FIGS. 2, 3A, 3B, 5, and 6) or contained in the
clones listed in Table 2, as deposited with the NRRL,
belong, under less stringent conditions, such as
moderately stringent conditions, e.g., washing in
0.2xSSC/0.1% SDS at 42°C (Ausubel et al., 1989, supra),
yet which still encodes a gene product functionally
equivalent to a gene product encoded by a gene of (a),
above.

The invention also includes nucleic acid
molecules, preferably DNA molecules, that hybridize to,
and are therefore the complements of, the DNA sequences
(a) through (d), in the preceding paragraph. Such
hybridization conditions can be highly stringent or less
highly stringent, as described above. In instances
wherein the nucleic acid molecules are
deoxyoligonucleotides ("oligos"), highly stringent
conditions can refer, e.g., to washing in 6xSSC/0.05%
sodium pyrophosphate at 37°C (for 14-base oligos), 48°C
(for 17-base oligos), 55°C (for 20-base oligos), and 60°C
(for 23-base oligos). These nucleic acid molecules can
act as target gene antisense molecules, useful, for
example, in target gene regulation and/or as antisense
primers in amplification reactions of target,
fingerprint, and/or pathway gene nucleic acid sequences.
Further, such sequences can be used as part of ribozyme
and/or triple helix sequences, also useful for target
gene regulation. Still further, such molecules can be
used as components of diagnostic methods whereby tumor
progression disorders can be detected.
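
The oligo wash temperatures quoted above can be held in a simple lookup table. The Python sketch below is illustrative only: the names `OLIGO_WASH_TEMP_C` and `wash_temperature_c` are hypothetical, the values are exactly the four length/temperature pairs stated in the text, and no interpolation to other oligo lengths is attempted because the text does not provide one.

```python
# Highly stringent wash temperatures (degrees C) in 6xSSC/0.05% sodium
# pyrophosphate, keyed by oligo length in bases, as quoted in the text.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length):
    """Return the quoted wash temperature for a listed oligo length.

    Raises KeyError for lengths the text does not list, rather than
    guessing an intermediate value.
    """
    return OLIGO_WASH_TEMP_C[oligo_length]
```

For instance, `wash_temperature_c(17)` returns 48, while an unlisted length such as 18 raises `KeyError`.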

The invention also encompasses (a) DNA vectors 
that contain any of the foregoing coding sequences and/or 
their complements (i.e., antisense); (b) DNA expression
vectors that contain any of the foregoing coding 
sequences operatively associated with a regulatory 
element that directs the expression of the coding 
sequences; and (c) genetically engineered host cells that 
contain any of the foregoing coding sequences operatively 
associated with a regulatory element that directs the 
expression of the coding sequences in the host cell. As 
used herein, regulatory elements include but are not 
limited to inducible and non-inducible promoters, 
enhancers, operators and other elements known to those 
skilled in the art that drive and regulate expression. 
The invention includes fragments of any of the DNA 
sequences disclosed herein. 

In addition to the gene sequences described above,
homologues of these gene sequences, as can, for example, be
present in other species, preferably human in instances
wherein the above described gene sequences are not human
gene sequences, can be identified and can readily be
isolated, without undue experimentation, by molecular
biological techniques well known in the art. Further,
there can exist genes at other genetic loci within the
genome that encode proteins which have extensive homology
to one or more domains of such gene products. These
genes can also be identified via similar techniques.

For example, the isolated differentially expressed
gene sequence can be labeled and used to screen a cDNA
library constructed from mRNA obtained from the organism
of interest. Hybridization conditions will be of a lower
stringency when the cDNA library was derived from an
organism different from the type of organism from which
the labeled sequence was derived. Alternatively, the
labeled fragment can be used to screen a genomic library
derived from the organism of interest, again, using
appropriately stringent conditions. Such low stringency
conditions will be well known to those of skill in the
art, and will vary predictably depending on the specific
organisms from which the library and the labeled
sequences are derived. For guidance regarding such
conditions see, for example, Sambrook et al., 1989,
Molecular Cloning, A Laboratory Manual, Cold Spring
Harbor Press, N.Y.; and Ausubel et al., 1989, Current
Protocols in Molecular Biology, (Green Publishing
Associates and Wiley Interscience, N.Y.).

Further, a previously unknown differentially
expressed or pathway gene-type sequence can be isolated
by performing PCR using two degenerate oligonucleotide
primer pools designed on the basis of amino acid
sequences within the gene of interest. The template for
the reaction can be cDNA obtained by reverse
transcription of mRNA prepared from human or non-human
cell lines or tissue known or suspected to express a
differentially expressed or pathway gene allele. The PCR
product can be subcloned and sequenced to ensure that the
amplified sequences represent the sequences of a
differentially expressed or pathway gene-like nucleic
acid sequence.
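
Because several codons can encode the same amino acid, the size of a degenerate primer pool depends on which stretch of amino acid sequence is chosen. The Python sketch below is an illustration under stated assumptions, not the procedure used herein: it assumes the standard genetic code, counts every codon position in full, and uses the hypothetical names `CODONS_PER_AMINO_ACID`, `pool_degeneracy` and `least_degenerate_window`. Practical primer design may reduce the pool further by other means that are not modeled here.

```python
# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes).
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def pool_degeneracy(peptide):
    """Return the number of distinct nucleotide sequences encoding `peptide`."""
    size = 1
    for residue in peptide.upper():
        size *= CODONS_PER_AMINO_ACID[residue]
    return size


def least_degenerate_window(protein, width):
    """Return (start, peptide, degeneracy) for the window of `width`
    residues that needs the smallest primer pool (0-based start).

    Raises ValueError if `protein` is shorter than `width`.
    """
    windows = [
        (pool_degeneracy(protein[i:i + width]), i)
        for i in range(len(protein) - width + 1)
    ]
    degeneracy, start = min(windows)
    return start, protein[start:start + width], degeneracy
```

Windows rich in methionine and tryptophan give the smallest pools, whereas windows rich in leucine, serine or arginine give the largest.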

The PCR fragment can then be used to isolate a
full length cDNA clone by a variety of methods. For
example, the amplified fragment can be labeled and used
to screen a bacteriophage cDNA library. Alternatively,
the labeled fragment can be used to screen a genomic
library.

PCR technology can also be utilized to isolate
full length cDNA sequences. For example, RNA can be
isolated, following standard procedures, from an
appropriate cellular or tissue source. A reverse
transcription reaction can be performed on the RNA using
an oligonucleotide primer specific for the most 5' end of
the amplified fragment for the priming of first strand
synthesis. The resulting RNA/DNA hybrid can then be
"tailed" with guanines using a standard terminal
transferase reaction, the hybrid can be digested with
RNase H, and second strand synthesis can then be primed
with a poly-C primer. Thus, cDNA sequences upstream of
the amplified fragment can easily be isolated. For a
review of cloning strategies which can be used, see e.g.,
Sambrook et al., 1989, Molecular Cloning, A Laboratory
Manual, Cold Spring Harbor Press, N.Y.; and Ausubel et
al., 1989, Current Protocols in Molecular Biology, (Green
Publishing Associates and Wiley Interscience, N.Y.).

In cases where the differentially expressed or
pathway gene identified is the normal, or wild type,
gene, this gene can be used to isolate mutant alleles of
the gene. Such an isolation is preferable in processes
and disorders which are known or suspected to have a
genetic basis. Mutant alleles can be isolated from
individuals either known or suspected to have a genotype
which contributes to tumor progression symptoms. Mutant
alleles and mutant allele products can then be utilized
in the therapeutic and diagnostic assay systems described
below.

A cDNA of a mutant gene can be isolated, for
example, by using PCR, a technique which is well-known to
one skilled in the art. In this case, the first cDNA
strand can be synthesized by hybridizing an oligo-dT
oligonucleotide to mRNA isolated from tissue known to or
suspected of expressing the gene in an individual putatively
carrying the mutant allele, and by extending the new
strand with reverse transcriptase. The second strand of
the cDNA can then be synthesized using an oligonucleotide
that hybridizes specifically to the 5'-end of the normal
gene. Using these two primers, the product is then
amplified via PCR, cloned into a suitable vector, and
subjected to DNA sequence analysis through methods
well-known to one skilled in the art. By comparing the DNA
sequence of the mutant gene to that of the normal gene,
the mutation(s) responsible for the loss or alteration of
function of the mutant gene product can be ascertained.
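
The final comparison step can be illustrated with a minimal sketch. The Python function below is hypothetical (the name `point_differences` is not from this document) and makes a simplifying assumption: the normal and mutant sequences are already aligned and of equal length, so only substitutions are reported. Insertions and deletions would require an alignment step that is not shown.

```python
def point_differences(normal, mutant):
    """List substitutions between two pre-aligned, equal-length sequences.

    Returns a list of (position, normal_base, mutant_base) tuples, with
    positions counted from 1.
    """
    if len(normal) != len(mutant):
        raise ValueError("sequences must be the same length; align them first")
    pairs = zip(normal.upper(), mutant.upper())
    return [(i + 1, n, m) for i, (n, m) in enumerate(pairs) if n != m]
```

For example, comparing `"ATGGCC"` with `"ATGACC"` reports a single G-to-A substitution at position 4.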

Alternatively, a genomic or cDNA library can be
constructed and screened using DNA or RNA, respectively,
from a tissue known to or suspected of expressing the
gene of interest in an individual suspected of or known
to carry the mutant allele. The normal gene or any
suitable fragment thereof can then be labeled and used as
a probe to identify the corresponding mutant allele in
the library. The clone containing this gene can then be
purified through methods routinely practiced in the art,
and subjected to sequence analysis as described, above,
in this Section.

Additionally, an expression library can be
constructed utilizing DNA isolated from or cDNA
synthesized from a tissue known to or suspected of
expressing the gene of interest in an individual
suspected of or known to carry the mutant allele. In
this manner, gene products made by the putatively mutant
tissue can be expressed and screened using standard
antibody screening techniques in conjunction with
antibodies raised against the normal gene product, as
described, below, in Section 5.2.3. (For screening
techniques, see, for example, Harlow, E. and Lane, eds.,
1988, "Antibodies: A Laboratory Manual," Cold Spring
Harbor Press, Cold Spring Harbor.) In cases where the
mutation results in an expressed gene product with
altered function (e.g., as a result of a missense
mutation), a polyclonal set of antibodies is likely to
cross-react with the mutant gene product. Library clones
detected via their reaction with such labeled antibodies
can be purified and subjected to sequence analysis as
described in this Section, above.

Taking the fomy030 gene as an example, the fomy030
human homolog can be isolated by a variety of methods.
First, sequences found in a murine fomy030 cDNA can be
utilized as hybridization probes to detect human fohy030
sequences. This can be accomplished, for example, by
probing Southern blots containing total human genomic DNA
with a labelled fomy030 probe. Once it is verified that
the probe being utilized detects the human 030 gene, one
of skill in the art can employ any of several routine
approaches to isolate the human gene without undue
experimentation.

In one approach, appropriate human cDNA libraries
can be screened. Such cDNA libraries can, for example,
include human melanocyte, human retina and fetal human
brain cDNA libraries. For example, panels of human
melanoma cells (such as, for example, SK-MEL-2, ATCC
68-HTB; SK-MEL-5, ATCC 70-HTB; SK-MEL-28, ATCC 72-HTB;
G-361, ATCC 1424-CRL; and/or HT-144 [63-HTB] cells) can be
screened for 030 expression by, for example, Northern
blot analysis. Upon detection of 030 transcript, cDNA
libraries can be constructed from RNA isolated from the
appropriate cell line, utilizing standard techniques well
known to those of skill in the art. The human cDNA
library can then be screened with a 030 probe in order to
isolate a human romy030 cDNA. As described below, this
method was used to determine the human fohy030 cDNAs in
FIGS. 5 and 6.

Alternatively, a human total genomic DNA library
can be screened using 030 probes. 030-positive clones
can then be sequenced and, further, the intron/exon
structure of the human 030 gene may be elucidated. Once
genomic sequence is obtained, oligonucleotide primers can
be designed based on the sequence for use in the
isolation, via, for example, RT-coupled PCR, of human 030
cDNA.

The procedures described in these approaches are
routine and have been described in detail in Sections
5.1.1.2, 5.3 and 5.7.2.

5.5. DIFFERENTIALLY EXPRESSED 
AND PATHWAY PRODUCTS

Differentially expressed and pathway gene products
include those proteins encoded by the differentially
expressed and pathway gene sequences described in Section
5.2.1, above, as for example, the peptide listed in FIG.
3. Specifically, differentially expressed and pathway
gene products can include differentially expressed and
pathway gene polypeptides encoded by the differentially
expressed and pathway gene sequences contained in the
clones listed in Table 2, above, as deposited with the
NRRL, or contained in the coding regions of the genes to
which DNA sequences disclosed herein (in FIGS. 3A, 3B, 5,
and 6) or contained in the clones listed in Table 2, as
deposited with the NRRL, belong, for example.

In addition, differentially expressed and pathway
gene products can include proteins that represent
functionally equivalent gene products. Such an
equivalent differentially expressed or pathway gene
product can contain deletions, additions or substitutions
of amino acid residues within the amino acid sequence
encoded by the differentially expressed or pathway gene
sequences described, above, in Section 5.2.1, but which
result in a silent change, thus producing a functionally
equivalent differentially expressed or pathway gene
product. Amino acid substitutions can be made on the
basis of similarity in polarity, charge, solubility,
hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar
(hydrophobic) amino acids include alanine, leucine,
isoleucine, valine, proline, phenylalanine, tryptophan,
and methionine; polar neutral amino acids include
glycine, serine, threonine, cysteine, tyrosine,
asparagine, and glutamine; positively charged (basic)
amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic
acid and glutamic acid. "Functionally equivalent," as
utilized herein, refers to a protein capable of
exhibiting a substantially similar in vivo activity as
the endogenous differentially expressed or pathway gene
products encoded by the differentially expressed or
pathway gene sequences described in Section 5.2.1, above.
Alternatively, when utilized as part of assays such as
those described, below, in Section 5.3, "functionally
equivalent" can refer to peptides capable of interacting
with other cellular or extracellular molecules in a
manner substantially similar to the way in which the
corresponding portion of the endogenous differentially
expressed or pathway gene product would.
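
By way of illustration only, the four residue classes listed
above can be written out as a lookup table, and a substitution
can be flagged as "conservative" when both residues fall in the
same class. This is a minimal sketch: the one-letter codes are
the standard ones, but the function names and the same-class
rule are assumptions made for the example, and the rule is only
a rough proxy for a silent change. It does not establish
functional equivalence, which depends on activity.

```python
# Residue classes exactly as grouped in the text (standard one-letter codes).
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala Leu Ile Val Pro Phe Trp Met
    "polar_neutral": set("GSTCYNQ"),  # Gly Ser Thr Cys Tyr Asn Gln
    "basic": set("RKH"),              # Arg Lys His
    "acidic": set("DE"),              # Asp Glu
}


def residue_class(aa):
    """Return the class name for a one-letter amino acid code."""
    aa = aa.upper()
    for name, members in RESIDUE_CLASSES.items():
        if aa in members:
            return name
    raise ValueError(f"unknown residue: {aa!r}")


def is_conservative(original, replacement):
    """Hypothetical rule: a substitution is conservative if the class is unchanged."""
    return residue_class(original) == residue_class(replacement)


def classify_substitutions(reference, variant):
    """List (position, old, new, conservative?) for each differing residue.

    Positions are 1-based. Only substitutions are handled here; deletions
    and additions would need an alignment step that is out of scope.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    changes = []
    for pos, (old, new) in enumerate(zip(reference, variant), start=1):
        if old != new:
            changes.append((pos, old, new, is_conservative(old, new)))
    return changes
```

For instance, classify_substitutions("MKLD", "MRLA") reports the
lysine-to-arginine change at position 2 as within-class and the
aspartic acid-to-alanine change at position 4 as a change of
class.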

The differentially expressed or pathway gene
products can be produced by synthetic techniques or via
recombinant DNA technology using techniques well known in
the art. Methods for preparing the differentially
expressed or pathway gene polypeptides and peptides of
the invention by expressing nucleic acid encoding
differentially expressed or pathway gene sequences are
described herein. Methods which are well known to those
skilled in the art can be used to construct expression
vectors containing differentially expressed or pathway
gene protein coding sequences and appropriate
transcriptional/translational control signals. These
methods include, for example, in vitro recombinant DNA
techniques, synthetic techniques and in vivo
recombination/genetic recombination. See, for example,
the techniques described in Maniatis et al., 1989,
Molecular Cloning A Laboratory Manual, Cold Spring Harbor
Laboratory, N.Y., and Ausubel, 1989.
Alternatively, RNA capable of encoding differentially
expressed or pathway gene protein sequences can be
chemically synthesized using, for example, synthesizers.
See, for example, the techniques described in
"Oligonucleotide Synthesis," 1984, Gait, M.J. ed., IRL
Press, Oxford.

A variety of host-expression vector systems can be
utilized to express the differentially expressed or
pathway gene coding sequences of the invention. Such
host-expression systems represent vehicles by which the
coding sequences of interest can be produced and
subsequently purified, but also represent cells which
can, when transformed or transfected with the appropriate
nucleotide coding sequences, exhibit the differentially
expressed or pathway gene protein of the invention in
situ. These include but are not limited to
microorganisms such as bacteria (e.g., E. coli, B.
subtilis) transformed with recombinant bacteriophage DNA,
plasmid DNA or cosmid DNA expression vectors containing
differentially expressed or pathway gene protein coding
sequences; yeast (e.g., Saccharomyces, Pichia)
transformed with recombinant yeast expression vectors
containing the differentially expressed or pathway gene
protein coding sequences; insect cell systems infected
with recombinant virus expression vectors (e.g.,
baculovirus) containing the differentially expressed or
pathway gene protein coding sequences; plant cell systems
infected with recombinant virus expression vectors (e.g.,
cauliflower mosaic virus, CaMV; tobacco mosaic virus,
TMV) or transformed with recombinant plasmid expression
vectors (e.g., Ti plasmid) containing differentially
expressed or pathway gene protein coding sequences; or
mammalian cell systems (e.g., COS, CHO, BHK, 293, 3T3)
harboring recombinant expression constructs containing
promoters derived from the genome of mammalian cells
(e.g., metallothionein promoter) or from mammalian
viruses (e.g., the adenovirus late promoter; the vaccinia
virus 7.5K promoter).

In bacterial systems, a number of expression
vectors can be advantageously selected depending upon the
use intended for the differentially expressed or pathway
gene protein being expressed. For example, when a large
quantity of such a protein is to be produced, for the
generation of antibodies or to screen peptide libraries,
for example, vectors which direct the expression of high
levels of fusion protein products that are readily
purified can be desirable. Such vectors include, but are
not limited to, the E. coli expression vector pUR278
(Ruther et al., 1983, EMBO J. 2:1791), in which the
differentially expressed or pathway gene protein coding
sequence can be ligated individually into the vector in
frame with the lacZ coding region so that a fusion
protein is produced; pIN vectors (Inouye & Inouye, 1985,
Nucleic Acids Res. 13:3101-3109; Van Heeke & Schuster,
1989, J. Biol. Chem. 264:5503-5509); and the like. pGEX
vectors can also be used to express foreign polypeptides
as fusion proteins with glutathione S-transferase (GST).
In general, such fusion proteins are soluble and can
easily be purified from lysed cells by adsorption to
glutathione-agarose beads followed by elution in the
presence of free glutathione. The pGEX vectors are
designed to include thrombin or factor Xa protease
cleavage sites so that the cloned target gene protein can
be released from the GST moiety.

In an insect system, Autographa californica
nuclear polyhedrosis virus (AcNPV) is used as a vector to
express foreign genes. The virus grows in Spodoptera
frugiperda cells. The differentially expressed or
pathway gene coding sequence can be cloned individually
into non-essential regions (for example, the polyhedrin
gene) of the virus and placed under control of an AcNPV
promoter (for example, the polyhedrin promoter).
Successful insertion of differentially expressed or
pathway gene coding sequence will result in inactivation
of the polyhedrin gene and production of non-occluded
recombinant virus (i.e., virus lacking the proteinaceous
coat coded for by the polyhedrin gene). These
recombinant viruses are then used to infect Spodoptera
frugiperda cells in which the inserted gene is expressed
(e.g., see Smith et al., 1983, J. Virol. 46:584; Smith,
U.S. Patent No. 4,215,051).

In mammalian host cells, a number of viral-based
expression systems can be utilized. In cases where an
adenovirus is used as an expression vector, the
differentially expressed or pathway gene coding sequence
of interest can be ligated to an adenovirus
transcription/translation control complex, e.g., the
late promoter and tripartite leader sequence. This
chimeric gene can then be inserted in the adenovirus
genome by in vitro or in vivo recombination. Insertion
in a non-essential region of the viral genome (e.g.,
region E1 or E3) will result in a recombinant virus that
is viable and capable of expressing differentially
expressed or pathway gene protein in infected hosts
(e.g., see Logan & Shenk, 1984, Proc. Natl. Acad. Sci.
USA 81:3655-3659). Specific initiation signals can also
be required for efficient translation of inserted
differentially expressed or pathway gene coding
sequences. These signals include the ATG initiation
codon and adjacent sequences. In cases where an entire
identified gene, including its own initiation codon and
adjacent sequences, is inserted into the appropriate
expression vector, no additional translational control
signals may be needed. However, in cases where only a
portion of the identified coding sequence is inserted,
exogenous translational control signals, including,
perhaps, the ATG initiation codon, must be provided.
Furthermore, the initiation codon must be in phase with
the reading frame of the desired coding sequence to
ensure translation of the entire insert. These exogenous
translational control signals and initiation codons can
be of a variety of origins, both natural and synthetic.
The efficiency of expression can be enhanced by the
inclusion of appropriate transcription enhancer elements,
transcription terminators, etc. (see Bittner et al.,
1987, Methods in Enzymol. 151:516-544).
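
The requirement that an exogenous initiation codon be in phase
with the reading frame of the insert can be illustrated with a
short sketch. Everything in it is hypothetical: the leader and
insert sequences are invented for the example, the function
names are not taken from any library, and the check is a simple
modulo-3 test on positions rather than a model of translation
in a cell.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_atg(seq):
    """Index of the first ATG in the construct, or -1 if there is none."""
    return seq.upper().find("ATG")


def in_phase(start_index, insert_offset):
    """True if the ATG at start_index shares a reading frame with the
    coding sequence that begins at insert_offset."""
    return (insert_offset - start_index) % 3 == 0


def translated_codons(seq, start_index):
    """Codons read from start_index up to, but excluding, the first stop."""
    seq = seq.upper()
    codons = []
    for i in range(start_index, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


# Hypothetical construct: leader + exogenous ATG + partial coding insert + stop.
leader = "GCCACC"
insert = "GCTAAAGGG"
construct = leader + "ATG" + insert + "TAA"
start = first_atg(construct)
insert_offset = len(leader) + 3

# The same insert with one extra base after the ATG is shifted out of frame.
shifted = leader + "ATG" + "C" + insert + "TAA"
shifted_offset = len(leader) + 4
```

With the in-frame construct, in_phase(start, insert_offset) is
true and the codons read from the ATG are the insert's own
codons. With the shifted construct the test fails, which is the
situation the passage above warns against.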

In addition, a host cell strain can be chosen
which modulates the expression of the inserted sequences,
or modifies and processes the gene product in the
specific fashion desired. Such modifications (e.g.,
glycosylation) and processing (e.g., cleavage) of protein
products can be important for the function of the
protein. Different host cells have characteristic and
specific mechanisms for the post-translational processing
and modification of proteins. Appropriate cell lines or
host systems can be chosen to ensure the correct
modification and processing of the foreign protein
expressed. To this end, eukaryotic host cells which
possess the cellular machinery for proper processing of
the primary transcript, glycosylation, and
phosphorylation of the gene product can be used. Such
mammalian host cells include but are not limited to CHO,
VERO, BHK, HeLa, COS, MDCK, 293, 3T3, WI38, etc.

For long-term, high-yield production of recombinant
proteins, stable expression is preferred. For
example, cell lines which stably express the
differentially expressed or pathway gene protein can be
engineered. Rather than using expression vectors which
contain viral origins of replication, host cells can be
transformed with DNA controlled by appropriate expression
control elements (e.g., promoter, enhancer sequences,
transcription terminators, polyadenylation sites, etc.),
and a selectable marker. Following the introduction of
the foreign DNA, engineered cells can be allowed to grow
for 1-2 days in an enriched medium, and are then switched
to a selective medium. The selectable marker in the
recombinant plasmid confers resistance to the selection
and allows cells to stably integrate the plasmid into
their chromosomes and grow to form foci which in turn can
be cloned and expanded into cell lines. This method can
advantageously be used to engineer cell lines which
express the identified gene protein. Such engineered
cell lines can be particularly useful in screening and
evaluation of compounds that affect the endogenous
activity of the differentially expressed or pathway gene
protein.

A number of selection systems can be used,
including, but not limited to, the herpes simplex virus
thymidine kinase (Wigler et al., 1977, Cell 11:223),
hypoxanthine-guanine phosphoribosyltransferase (Szybalska
& Szybalski, 1962, Proc. Natl. Acad. Sci. USA 48:2026),
and adenine phosphoribosyltransferase (Lowy et al.,
1980, Cell 22:817) genes in tk-, hgprt- or aprt- cells,
respectively. Also, antimetabolite resistance can be
used as the basis of selection for dhfr, which confers
resistance to methotrexate (Wigler et al., 1980, Natl.
Acad. Sci. USA 77:3567; O'Hare et al., 1981, Proc. Natl.
Acad. Sci. USA 78:1527); gpt, which confers resistance to
mycophenolic acid (Mulligan & Berg, 1981, Proc. Natl.
Acad. Sci. USA 78:2072); neo, which confers resistance to
the aminoglycoside G-418 (Colberre-Garapin et al., 1981,
J. Mol. Biol. 150:1); and hygro, which confers resistance
to hygromycin (Santerre et al., 1984, Gene 30:147)
genes.

An alternative fusion protein system allows for
the ready purification of non-denatured fusion proteins
expressed in human cell lines (Janknecht et al., 1991,
Proc. Natl. Acad. Sci. USA 88:8972-8976). In this
system, the gene of interest is subcloned into a vaccinia
recombination plasmid such that the gene's open reading
frame is translationally fused to an amino-terminal tag
consisting of six histidine residues. Extracts from
cells infected with recombinant vaccinia virus are loaded
onto Ni2+ nitriloacetic acid-agarose columns and
histidine-tagged proteins are selectively eluted with
imidazole-containing buffers.

When used as a component in assay systems such as
that described herein, the differentially expressed or
pathway gene protein can be labeled, either directly or
indirectly, to facilitate detection of a complex formed
between the differentially expressed or pathway gene
protein and a test substance. Any of a variety of
suitable labeling systems can be used including but not
limited to radioisotopes such as 125I; enzyme labeling
systems that generate a detectable colorimetric signal or
light when exposed to substrate; and fluorescent labels.

Where recombinant DNA technology is used to
produce the differentially expressed or pathway gene
protein for such assay systems, it can be advantageous to
engineer fusion proteins that can facilitate labeling,
solubility, immobilization and/or detection.

Indirect labeling involves the use of a third
protein, such as a labeled antibody, which specifically
binds to a differentially expressed or pathway
gene product. Such antibodies include but are not
limited to polyclonal, monoclonal, chimeric, single
chain, Fab fragments and fragments produced by a Fab
expression library.

5.6. ANTIBODIES SPECIFIC FOR
DIFFERENTIALLY EXPRESSED
OR PATHWAY GENE PRODUCTS

Described herein are methods for the production of
antibodies capable of specifically recognizing one or
more differentially expressed or pathway gene epitopes.
Such antibodies can include, but are not limited to,
polyclonal antibodies, monoclonal antibodies (mAbs),
humanized or chimeric antibodies, single chain
antibodies, Fab fragments, F(ab')2 fragments, fragments
produced by a Fab expression library, anti-idiotypic
(anti-Id) antibodies, and epitope-binding fragments of
any of the above. Such antibodies can be used, for
example, in the detection of a fingerprint, target, or
pathway gene in a biological sample, or, alternatively,
as a method for the inhibition of abnormal target gene
activity. Thus, such antibodies can be utilized as part of
tumor progression treatment methods, and/or can be used as
part of diagnostic techniques whereby patients can be tested
for abnormal levels of fingerprint, target, or pathway
gene proteins, or for the presence of abnormal forms of
such proteins.

For the production of antibodies to a
differentially expressed or pathway gene, various host
animals can be immunized by injection with a
differentially expressed or pathway gene protein, or a
portion thereof. Such host animals can include but are
not limited to rabbits, mice, and rats, to name but a
few. Various adjuvants can be used to increase the
immunological response, depending on the host species,
including but not limited to Freund's (complete and
incomplete), mineral gels such as aluminum hydroxide,
surface active substances such as lysolecithin, pluronic
polyols, polyanions, peptides, oil emulsions, keyhole
limpet hemocyanin, dinitrophenol, and potentially useful
human adjuvants such as BCG (bacille Calmette-Guerin) and
Corynebacterium parvum.

Polyclonal antibodies are heterogeneous
populations of antibody molecules derived from the sera
of animals immunized with an antigen, such as target gene
product (e.g., protein encoded by 030), or an antigenic
functional derivative thereof. For the production of
polyclonal antibodies, host animals such as those
described above can be immunized by injection with
differentially expressed or pathway gene product (e.g.,
030) supplemented with adjuvants as also described above.

Monoclonal antibodies, which are homogeneous
populations of antibodies to a particular antigen, can be
obtained by any technique which provides for the
production of antibody molecules by continuous cell lines
in culture. These include, but are not limited to, the
hybridoma technique of Kohler and Milstein (1975, Nature
256:495-497; and U.S. Patent No. 4,376,110), the human
B-cell hybridoma technique (Kosbor et al., 1983, Immunology
Today 4:72; Cole et al., 1983, Proc. Natl. Acad. Sci. USA
80:2026-2030), and the EBV-hybridoma technique (Cole et
al., 1985, Monoclonal Antibodies And Cancer Therapy, Alan
R. Liss, Inc., pp. 77-96). Such antibodies can be of any
immunoglobulin class including IgG, IgM, IgE, IgA, IgD
and any subclass thereof. The hybridoma producing the
mAb of this invention can be cultivated in vitro or in
vivo. Production of high titers of mAbs in vivo makes
this the presently preferred method of production.

In addition, techniques developed for the
production of "chimeric antibodies" (Morrison et al.,
1984, Proc. Natl. Acad. Sci., 81:6851-6855; Neuberger et
al., 1984, Nature, 312:604-608; Takeda et al., 1985,
Nature, 314:452-454; U.S. Patent No. 4,816,567) by
splicing the genes from a mouse antibody molecule of
appropriate antigen specificity together with genes from
a human antibody molecule of appropriate biological
activity can be used. A chimeric antibody is a molecule
in which different portions are derived from different
animal species, such as those having a variable region
derived from a murine mAb and a human immunoglobulin
constant region.

Alternatively, techniques described for the
production of single chain antibodies (U.S. Patent
4,946,778; Bird, 1988, Science 242:423-426; Huston et
al., 1988, Proc. Natl. Acad. Sci. USA 85:5879-5883; and
Ward et al., 1989, Nature 334:544-546) and for making
humanized monoclonal antibodies (U.S. Patent No.
5,225,539) can be utilized to produce anti-differentially
expressed or anti-pathway gene product antibodies.

Antibody fragments which recognize specific
epitopes can be generated by known techniques. For
example, such fragments include but are not limited to:
the F(ab')2 fragments which can be produced by pepsin
digestion of the antibody molecule and the Fab fragments
which can be generated by reducing the disulfide bridges
of the F(ab')2 fragments. Alternatively, Fab expression
libraries can be constructed (Huse et al., 1989, Science,
246:1275-1281) to allow rapid and easy identification of
monoclonal Fab fragments with the desired specificity.

5.7. CELL- AND ANIMAL-BASED MODEL SYSTEMS
Described herein are cell- and animal-based
systems which represent reliable models for tumor
progression disorders. These systems can be used in a
variety of applications. For example, the cell- and
animal-based model systems can be used to identify
differentially expressed genes via the paradigms
described, above, in Section 5.1.1.1. Such systems can
also be used to further characterize differentially
expressed and pathway genes, as described, above, in
Section 5.3. Such further characterization can, for
example, indicate that a differentially expressed gene is
a target gene. Additionally, such assays
can be utilized as part of screening strategies designed
to identify compounds which are capable of preventing
and/or ameliorating symptoms of tumor progression
disorders, including those associated with metastatic
diseases, as described, below. Thus, the animal- and
cell-based models can be used to identify drugs,
pharmaceuticals, therapies and interventions which can be
effective in treating tumor progression disorders, such
as, for example, metastatic diseases. In addition, as
described in detail, below, in Section 5.10.1, such
animal models can be used to determine the LD50 and the
ED50 in animal subjects, and such data can be used to
determine the in vivo efficacy of potential anti-tumor
progression disorder treatments.

5.7.1. ANIMAL-BASED SYSTEMS

Animal-based model systems of tumor progression
disorders can include both non-recombinant animals and
recombinantly engineered transgenic animals.

Non-recombinant animal models for tumor
progression can include, for example, murine models of
melanoma, prostate cancer and colon cancer. Such models
may be generated, for example, by introducing tumor cells
into syngeneic mice using techniques such as subcutaneous
injection, tail vein injection, spleen implantation,
intraperitoneal implantation, implantation under the
renal capsule or orthotopic implantation (e.g., colon
cancer cells implanted in colonic tissue or prostatic
cancer cells implanted in prostate gland). After an
appropriate period of time, the tumors which result from
these injections can be counted and analyzed.

Among the cells which may be used for the
production of such animal models of tumor progression are
cells derived from the cell lines listed, above, in
Section 5.1.1.1. For example, B16 melanoma cells
(Fidler, I.J., 1973, Nature New Biol. 242:148-149),
including cell variants exhibiting high (e.g., B16 F10
cells) and low (e.g., B16 F1 cells) metastatic potential
may be utilized. Post-injection, pulmonary tumors
generally develop in the mouse models. Thus, these
animals serve as models not only of melanoma tumor
progression but also of pulmonary metastases.

For the generation of animal models of colorectal
cancers, colon cancer cells such as, for example, KM12c
(low metastatic potential) and KM12L4 (highly metastatic)
cells (Morikawa, K. et al., 1988, Cancer Research
48:1943-1948) can be implanted into the spleens of nude
mice. In these cases, the animals generally develop hepatic
tumors. Thus, such animals serve as models not only of
colorectal tumor progression but also of
hepatic metastases.

For the generation of animal models of prostate
cancer tumor progression, cells derived from, for
example, the high metastatic potential prostatic cell
line PC-3-M or the non-metastatic cell line DU 145
(Karmali, R.A. et al., 1987, Anticancer Res. 7:1173-1180;
Kozlowski, J.M. et al., 1984, Cancer Research 44:3522-
3529) may be implanted into the prostates of animals and
the resulting tumors may be analyzed and compared to, for
example, normal tissue. In such a manner, genes which
are differentially expressed in neoplastic versus normal
cells as well as versus metastatic cells may be
identified.

The role of identified gene products (e.g., 030
gene products) can be determined by transfecting cDNAs
encoding such gene products into the appropriate cell
line and analyzing its effect on the cells' ability to
induce tumor progression in animal models such as these.
The role of the identified gene products may be further
analyzed by, for example, culturing cells derived from
the tumors which develop in the animal models,
introducing these cultured cells into animals, and
subsequently measuring the level of identified gene
product present in the resulting tumor cells. In this
manner, cell line variants are developed which can be
useful in analyzing the role of quantitative and/or
qualitative differences in the expression of the
identified genes on the cells' ability to induce tumor
progression. For example, as demonstrated, below, in the
Example presented in Section 6, 030 gene expression is
inversely related to the metastatic potential of the
tumor cell line used to generate such a tumor progression
animal model.

Additionally, recombinant animal models exhibiting
tumor progression characteristics and/or symptoms of
tumor progression disorders, including metastatic
diseases, can be utilized, for example, such well-known
animal models as the transgenic mouse model for human
melanoma and transgenic mice which carry specific
mutations which result in multiple intestinal tumors
(Mintz, B. and Silvers, W.K., 1993, Proc. Natl. Acad. Sci.
USA 90:8817-8821; and Fodde, R., et al., 1994, Proc.
Natl. Acad. Sci. USA 91:8969-8973). Further, recombinant
animal models for tumor progression can be engineered by
utilizing, for example, target gene sequences such as
those described, above, in Section 5.4, in conjunction
with techniques for producing transgenic animals that are
well known to those of skill in the art. For example,
target gene sequences can be introduced into, and
overexpressed in, the genome of the animal of interest,
or, if endogenous target gene sequences are present, they
can either be overexpressed or, alternatively, can be
disrupted in order to underexpress or inactivate target
gene expression.

In order to overexpress a target gene sequence,
the coding portion of the target gene sequence can be
ligated to a regulatory sequence which is capable of
driving gene expression in the animal and cell type of
interest. Such regulatory regions will be well known to
those of skill in the art, and can be utilized in the
absence of undue experimentation.

In order to obtain underexpression of an
endogenous target gene sequence, such a sequence can be
introduced into the genome of the animal of interest such
that the endogenous target gene alleles will be
inactivated. Preferably, an engineered sequence
comprising at least part of the target gene sequence is
utilized and is introduced, via gene targeting, such that
the endogenous target sequence is disrupted upon
integration of the engineered target gene sequence into
the animal's genome. Gene targeting is discussed, below,
in this Section.

Animals of any species, including, but not limited
to, mice, rats, rabbits, guinea pigs, pigs, micro-pigs,
goats, and non-human primates, e.g., baboons, monkeys,
and chimpanzees, can be used to generate animal models of
tumor progression and tumor progression disorders, such
as, for example, metastatic diseases.

Any technique known in the art can be used to
introduce a target gene transgene into animals to produce
the founder lines of transgenic animals. Such techniques
include, but are not limited to, pronuclear microinjection
(Hoppe, P.C. and Wagner, T.E., 1989, U.S. Pat. No.
4,873,191); retrovirus mediated gene transfer into germ
lines (Van der Putten et al., 1985, Proc. Natl. Acad.
Sci., USA 82:6148-6152); gene targeting in embryonic stem
cells (Thompson et al., 1989, Cell 56:313-321);
electroporation of embryos (Lo, 1983, Mol Cell. Biol.
3:1803-1814); and sperm-mediated gene transfer (Lavitrano
et al., 1989, Cell 57:717-723); etc. For a review of
such techniques, see Gordon, 1989, Transgenic Animals,
Intl. Rev. Cytol. 115:171-229.

The present invention provides for transgenic
animals that carry the transgene in all their cells, as
well as animals which carry the transgene in some, but
not all their cells, i.e., mosaic animals. The transgene
can be integrated, either as a single transgene or in
concatamers, e.g., head-to-head tandems or head-to-tail
tandems. The transgene can also be selectively
introduced into and activated in a particular cell type
by following, for example, the teaching of Lasko et al.
(Lasko, M., et al., 1992, Proc. Natl. Acad. Sci. USA
89:6232-6236). The regulatory sequences required for
such a cell-type specific activation will depend upon the
particular cell type of interest, and will be apparent to
those of skill in the art.

When it is desired that the target gene transgene
be integrated into the chromosomal site of the endogenous
target gene, gene targeting is preferred. Briefly, when
such a technique is to be utilized, vectors containing
some nucleotide sequences homologous to the endogenous
target gene of interest are designed for the purpose of
integrating, via homologous recombination with
chromosomal sequences, into and disrupting the function
of, the nucleotide sequence of the endogenous target
gene. The transgene can also be selectively introduced
into a particular cell type, thus inactivating the
endogenous gene of interest in only that cell type, by
following, for example, the teaching of Gu et al. (Gu,
H., et al., 1994, Science 265:103-106). The regulatory
sequences required for such a cell-type specific
inactivation will depend upon the particular cell type of
interest, and will be apparent to those of skill in the
art.

Once transgenic animals have been generated, the
expression of the recombinant target gene and protein can
be assayed utilizing standard techniques. Initial
screening can be accomplished by Southern blot analysis
or PCR techniques to analyze animal tissues to assay
whether integration of the transgene has taken place.
The level of mRNA expression of the transgene in the
tissues of the transgenic animals can also be assessed
using techniques which include, but are not limited to,
Northern blot analysis of tissue samples obtained from
the animal, in situ hybridization analysis, and RT-coupled
PCR. Samples of target gene-expressing tissue
can also be evaluated immunocytochemically using
antibodies specific for the transgenic product of
interest.

The target gene transgenic animals that express
target gene mRNA or target gene transgene peptide
(detected immunocytochemically, using antibodies directed
against target gene product epitopes) at easily
detectable levels should then be further evaluated to
identify those animals which display tumor progression
state characteristics, including tumor progression
disorder symptoms. Such tumor progression disorder
characteristics and/or symptoms can include, for example,
those associated with such tumor cells as found in human
melanoma, breast, gastrointestinal, such as esophageal,
stomach, colon, bowel, colorectal and rectal cancers,
prostate, bladder, testicular, ovarian, uterine,
cervical, brain, lung, bronchial, larynx, pharynx, liver,
pancreatic, thyroid, bone, leukemias, lymphomas and
various types of skin cancers.

Additionally, specific cell types within the
transgenic animals can be analyzed for cellular
phenotypes characteristic of tumor progression. Such
cellular phenotypes can include, for example,
differential gene expression characteristic of cells
within a given tumor progression state of interest.
Further, such cellular phenotypes can include an
assessment of a particular cell type fingerprint pattern
of expression and its comparison to known fingerprint
expression profiles of the particular cell type in
animals exhibiting tumor progression. Such transgenic
animals serve as suitable model systems for tumor
progression disorders.

Once target gene transgenic founder animals are
produced (i.e., those animals which express target gene
proteins in cells or tissues of interest, and which,
preferably, exhibit tumor progression characteristics),
they can be bred, inbred, outbred, or crossbred to
produce colonies of the particular animal. Examples of
such breeding strategies include but are not limited to:
outbreeding of founder animals with more than one
integration site in order to establish separate lines;
inbreeding of separate lines in order to produce compound
target gene transgenics that express the target gene
transgene of interest at higher levels because of the
effects of additive expression of each target gene
transgene; crossing of heterozygous transgenic animals to
produce animals homozygous for a given integration site
in order to both augment expression and eliminate the
possible need for screening of animals by DNA analysis;
crossing of separate homozygous lines to produce compound
heterozygous or homozygous lines; breeding animals to
different inbred genetic backgrounds so as to examine
effects of modifying alleles on expression of the target
gene transgene and the development of symptoms for tumor
progression disorders. One such approach is to cross the
target gene transgenic founder animals with a wild type
strain to produce an F1 generation that exhibits symptoms
for tumor progression disorders. The F1 generation can
then be inbred in order to develop a homozygous line, if
it is found that homozygous target gene transgenic
animals are viable.
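
As an illustration of the crosses described above, the following is a
minimal sketch of the expected genotype fractions at a single integration
site. It assumes simple Mendelian segregation and equal viability of all
genotypes, neither of which is stated in the text; the allele labels "T"
(transgene present) and "w" (wild type) and the function name cross are
hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Expected genotype fractions for a single-locus cross.

    Each parent is a two-character genotype string built from the
    hypothetical allele labels "T" (transgene) and "w" (wild type).
    Assumes Mendelian segregation and equal viability of offspring.
    """
    # Pair every allele of one parent with every allele of the other;
    # sorting makes "wT" and "Tw" count as the same genotype.
    counts = Counter("".join(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Founder (one copy of the transgene) crossed with a wild type strain.
f1 = cross("Tw", "ww")
# Intercross of two heterozygous F1 animals.
f2 = cross("Tw", "Tw")
```

Under these assumptions, half of the F1 animals carry the transgene, and
one quarter of the intercross progeny are homozygous for the integration
site. Observed ratios would differ if homozygous animals are not viable.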



5.7.2. CELL-BASED ASSAYS

Cells that contain and express target gene
sequences which encode target gene protein, and, further,
exhibit cellular phenotypes associated with tumor
progression disorders, can be utilized to identify
compounds that exhibit an ability to prevent and/or
ameliorate tumor progression. Cellular phenotypes which
can indicate an ability to ameliorate symptoms of tumor
progression disorders can include, for example, tumor
cells with low or high metastatic potential.

Further, the fingerprint pattern of gene
expression of cells of interest can be analyzed and
compared to the normal fingerprint pattern. Those
compounds which cause cells exhibiting cellular
phenotypes of tumor progression disorders, including
metastatic diseases, to produce a fingerprint pattern
more closely resembling a normal fingerprint pattern for
the cell of interest can be considered candidates for
further testing regarding an ability to ameliorate the
symptoms of such diseases.
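
The comparison of fingerprint patterns described above can be sketched
as follows. This is only an illustrative sketch: the text does not
specify a similarity measure, so Euclidean distance over shared genes is
an assumption, and the gene names and expression values are hypothetical.

```python
import math


def profile_distance(profile, reference):
    """Euclidean distance between two expression profiles.

    Profiles are dicts mapping gene name -> expression value; only
    genes present in both profiles are compared. The choice of
    Euclidean distance is an assumption made for illustration.
    """
    genes = sorted(set(profile) & set(reference))
    if not genes:
        raise ValueError("profiles share no genes")
    return math.sqrt(sum((profile[g] - reference[g]) ** 2 for g in genes))


def moves_toward_normal(untreated, treated, normal):
    """True if the treated profile is closer to normal than the untreated one."""
    return profile_distance(treated, normal) < profile_distance(untreated, normal)


# Hypothetical log2 relative expression values for three fingerprint genes.
normal = {"geneA": 0.0, "geneB": 0.0, "geneC": 0.0}
untreated = {"geneA": -3.0, "geneB": 2.0, "geneC": 1.0}
treated = {"geneA": -1.0, "geneB": 0.5, "geneC": 1.0}

candidate = moves_toward_normal(untreated, treated, normal)
```

A compound for which moves_toward_normal returns True would, under these
assumptions, be flagged as a candidate for further testing.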

Cells which will be utilized for such assays can,
for example, include non-recombinant cell lines, such as,
but not limited to, melanoma (e.g., B16 F1 and B16 F10
cell lines), human colon (e.g., KM12c and KM20L4 cell
lines), prostate (e.g., DU 145 and PC-3-M cell lines) and
breast cancer cell lines (e.g., MCF-7 and MDA-MB-435 cell
lines). In addition, purified primary or secondary tumor
cells derived from either transgenic or non-transgenic
tumor cells can be used.

Further, cells which can be used for such assays
can also include recombinant, transgenic cell lines. For
example, the metastatic disease animal models of the
invention, discussed, above, in Section 5.2.4.1, can be
used to generate cell lines, containing one or more cell
types involved in metastatic diseases, that can be used
as cell culture models for these disorders. While
primary cultures derived from the metastasis in
transgenic animals of the invention can be utilized, the
generation of continuous cell lines is preferred. For
examples of techniques which can be used to derive a
continuous cell line from the transgenic animals, see
Small et al., 1985, Mol. Cell Biol. 5:642-648.

Alternatively, cells of a cell type known to be
involved in metastatic diseases can be transfected with
sequences capable of increasing or decreasing the amount
of target gene expression within the cell. For example,
target gene sequences can be introduced into, and
overexpressed in, the genome of the cell of interest, or, if
endogenous target gene sequences are present, they can
either be overexpressed or, alternatively, be disrupted
in order to underexpress or inactivate target gene
expression.

In order to overexpress a target gene sequence, 
the coding portion of the target gene sequence can be 
ligated to a regulatory sequence which is capable of 
driving gene expression in the cell type of interest. 
Such regulatory regions will be well known to those of 
skill in the art, and can be utilized in the absence of 
undue experimentation. 



For underexpression of an endogenous target gene
sequence, such a sequence can be isolated and engineered
such that when reintroduced into the genome of the cell
type of interest, the endogenous target gene alleles will
be inactivated. Preferably, the engineered target gene
sequence is introduced via gene targeting such that the
endogenous target sequence is disrupted upon integration
of the engineered target gene sequence into the cell's
genome. Gene targeting is discussed, above, in Section
5.7.1.

Transfection of target gene sequence nucleic acid
can be accomplished by utilizing standard techniques.
See, for example, Ausubel, 1989, supra. Transfected
cells should be evaluated for the presence of the
recombinant target gene sequences, for expression and
accumulation of target gene mRNA, and for the presence of
recombinant target gene protein production. In instances
wherein a decrease in target gene expression is desired,
standard techniques can be used to demonstrate whether a
decrease in endogenous target gene expression and/or in
target gene product production is achieved.



5.8. SCREENING ASSAYS FOR COMPOUNDS THAT
INTERACT WITH THE TARGET GENE PRODUCT

The following assays are designed to identify
compounds that bind to target gene products, compounds
that bind to other cellular proteins that interact with a
target gene product, and compounds that interfere with the
interaction of the target gene product with other
cellular proteins.

Such compounds can include, but are not limited
to, other cellular proteins. Specifically, such
compounds can include, but are not limited to, peptides,
such as, for example, soluble peptides, including, but
not limited to, Ig-tailed fusion peptides, comprising
extracellular portions of target gene product
transmembrane receptors, and members of random peptide
libraries (see, e.g., Lam, K.S. et al., 1991, Nature
354:82-84; Houghton, R. et al., 1991, Nature 354:84-86),
made of D- and/or L-configuration amino acids,
phosphopeptides (including, but not limited to, members
of random or partially degenerate phosphopeptide
libraries; see, e.g., Songyang, Z. et al., 1993, Cell
72:767-778), antibodies (including, but not limited to,
polyclonal, monoclonal, humanized, anti-idiotypic,
chimeric or single chain antibodies, and FAb, F(ab')2 and
FAb expression library fragments, and epitope-binding
fragments thereof), and small organic or inorganic
molecules.

Compounds identified via assays such as those
described herein can be useful, for example, in
elaborating the biological function of the target gene
product, and for ameliorating symptoms of tumor
progression. In instances, for example, whereby a tumor
progression state or disorder results from a lower
overall level of target gene expression, target gene
product, and/or target gene product activity in a cell
involved in the tumor progression state or disorder,
compounds that interact with the target gene product can
include ones which accentuate or amplify the activity of
the bound target gene protein. Such compounds would
bring about an effective increase in the level of target
gene activity, thus ameliorating symptoms of the tumor
progression disorder or state. In instances whereby
mutations within the target gene cause aberrant target
gene proteins to be made which have a deleterious effect
that leads to tumor progression, compounds that bind
target gene protein can be identified that inhibit the
activity of the bound target gene protein. Assays for
testing the effectiveness of compounds, identified by,
for example, techniques such as those described in
Sections 5.8.1-5.8.3, are discussed, below, in Section
5.8.4.



5.8.1. IN VITRO SCREENING ASSAYS FOR COMPOUNDS
THAT BIND TO A TARGET GENE PRODUCT

In vitro systems can be designed to identify
compounds capable of binding the target gene products of
the invention. Compounds identified can be useful, for
example, in modulating the activity of wild type and/or
mutant target gene products, preferably mutant target
gene proteins, can be useful in elaborating the
biological function of the target gene product, can be
utilized in screens for identifying compounds that
disrupt normal target gene interactions, or can in
themselves disrupt such interactions.

The principle of the assays used to identify
compounds that bind to the target gene product involves
preparing a reaction mixture of the target gene protein
and the test compound under conditions and for a time
sufficient to allow the two components to interact and
bind, thus forming a complex which can be removed and/or
detected in the reaction mixture. These assays can be
conducted in a variety of ways. For example, one method
to conduct such an assay would involve anchoring target
gene product or the test substance onto a solid phase and
detecting target gene product/test compound complexes
anchored on the solid phase at the end of the reaction.
In one embodiment of such a method, the target gene
product can be anchored onto a solid surface, and the
test compound, which is not anchored, can be labeled,
either directly or indirectly.

In practice, microtitre plates can conveniently be
utilized as the solid phase. The anchored component can
be immobilized by non-covalent or covalent attachments.
Non-covalent attachment can be accomplished by simply
coating the solid surface with a solution of the protein
and drying. Alternatively, an immobilized antibody,
preferably a monoclonal antibody, specific for the
protein to be immobilized can be used to anchor the
protein to the solid surface. The surfaces can be
prepared in advance and stored.

In order to conduct the assay, the nonimmobilized
component is added to the coated surface containing the
anchored component. After the reaction is complete,
unreacted components are removed (e.g., by washing) under
conditions such that any complexes formed will remain
immobilized on the solid surface. The detection of
complexes anchored on the solid surface can be
accomplished in a number of ways. Where the previously
nonimmobilized component is pre-labeled, the detection of
label immobilized on the surface indicates that complexes
were formed. Where the previously nonimmobilized
component is not pre-labeled, an indirect label can be
used to detect complexes anchored on the surface; e.g.,
using a labeled antibody specific for the previously
nonimmobilized component (the antibody, in turn, can be
directly labeled or indirectly labeled with a labeled
anti-Ig antibody).

Alternatively, a reaction can be conducted in a
liquid phase, the reaction products separated from
unreacted components, and complexes detected; e.g., using
an immobilized antibody specific for target gene product
or the test compound to anchor any complexes formed in
solution, and a labeled antibody specific for the other
component of the possible complex to detect anchored
complexes.

5.8.2. ASSAYS FOR CELLULAR PROTEINS THAT
INTERACT WITH THE TARGET GENE PRODUCT

Any method suitable for detecting protein-protein
interactions can be employed for identifying novel target
product-cellular or extracellular protein interactions.
These methods are outlined in Section 5.1.3., supra, for
the identification of pathway genes, and can be utilized
herein with respect to the identification of proteins
which interact with identified target proteins. In such
a case, the target gene serves as the known "bait" gene.

5.8.3. ASSAYS FOR COMPOUNDS THAT
INTERFERE WITH TARGET
GENE/CELLULAR PRODUCT INTERACTION

The target gene products of the invention can, in
vivo, interact with one or more cellular or extracellular
macromolecules, such as proteins. Such macromolecules
include, but are not limited to, nucleic acid molecules
and those products identified via methods such as those
described, above, in Section 5.8.2. For the purposes of
this discussion, such cellular and extracellular
macromolecules are referred to herein as "binding
partners." Compounds that disrupt such interactions can
be useful in regulating the activity of the target gene
product, especially mutant target gene products. Such
compounds can include, but are not limited to, molecules
such as antibodies, peptides, and the like described in
Section 5.3.1. above.

The basic principle of the assay systems used to
identify compounds that interfere with the interaction
between the target gene product and its cellular or
extracellular binding partner or partners involves
preparing a reaction mixture containing the target gene
product and the binding partner under conditions and for
a time sufficient to allow the two products to interact
and bind, thus forming a complex. In order to test a
compound for inhibitory activity, the reaction mixture is
prepared in the presence and absence of the test
compound. The test compound can be initially included in
the reaction mixture, or can be added at a time
subsequent to the addition of target gene and its
cellular or extracellular binding partner. Control
reaction mixtures are incubated without the test compound
or with a placebo. The formation of any complexes
between the target gene product and the cellular or
extracellular binding partner is then detected. The
formation of a complex in the control reaction, but not
in the reaction mixture containing the test compound,
indicates that the compound interferes with the
interaction of the target gene product and the
interactive binding partner. Additionally, complex
formation within reaction mixtures containing the test
compound and normal target gene product can also be
compared to complex formation within reaction mixtures
containing the test compound and mutant target gene
product. This comparison can be important in those cases
wherein it is desirable to identify compounds that
disrupt interactions of mutant but not normal target gene
products.
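
The comparison of control and test reaction mixtures described above can
be expressed numerically. The following is a minimal sketch, assuming
that complex formation is read out as a single signal value per reaction
(for example, bound label) and that a background reading is available;
the function names, the 50% threshold and the example readings are
hypothetical and are not taken from the text.

```python
def percent_inhibition(control_signal, test_signal, background=0.0):
    """Percent reduction in complex signal caused by a test compound.

    control_signal: signal from the reaction without the test compound.
    test_signal: signal from the reaction containing the test compound.
    background: signal measured with no complex present (assumed known).
    """
    control = control_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - (test_signal - background) / control)


def is_mutant_selective(inhibition_mutant, inhibition_normal, threshold=50.0):
    """True if the compound inhibits the mutant complex but not the normal one.

    The 50% threshold is an arbitrary illustrative cutoff.
    """
    return inhibition_mutant >= threshold and inhibition_normal < threshold


# Hypothetical readings (arbitrary units), background of 100.
mutant_inhibition = percent_inhibition(1000.0, 300.0, background=100.0)
normal_inhibition = percent_inhibition(1000.0, 910.0, background=100.0)
selective = is_mutant_selective(mutant_inhibition, normal_inhibition)
```

With these hypothetical readings the compound would be scored as
disrupting the mutant interaction while largely sparing the normal one.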

The assay for compounds that interfere with the
interaction of the target gene products and binding
partners can be conducted in a heterogeneous or
homogeneous format. Heterogeneous assays involve
anchoring either the target gene product or the binding
partner onto a solid phase and detecting complexes
anchored on the solid phase at the end of the reaction.
In homogeneous assays, the entire reaction is carried out
in a liquid phase. In either approach, the order of
addition of reactants can be varied to obtain different
information about the compounds being tested. For
example, test compounds that interfere with the
interaction between the target gene products and the
binding partners, e.g., by competition, can be identified
by conducting the reaction in the presence of the test
substance; i.e., by adding the test substance to the
reaction mixture prior to or simultaneously with the
target gene product and interactive cellular or
extracellular binding partner. Alternatively, test
compounds that disrupt preformed complexes, e.g.,
compounds with higher binding constants that displace one
of the components from the complex, can be tested by
adding the test compound to the reaction mixture after
complexes have been formed. The various formats are
described briefly below.

In a heterogeneous assay system, either the target
gene product or the interactive cellular or extracellular
binding partner is anchored onto a solid surface, while
the non-anchored species is labeled, either directly or
indirectly. In practice, microtitre plates are
conveniently utilized. The anchored species can be
immobilized by non-covalent or covalent attachments.
Non-covalent attachment can be accomplished simply by
coating the solid surface with a solution of the target
gene product or binding partner and drying.
Alternatively, an immobilized antibody specific for the
species to be anchored can be used to anchor the species
to the solid surface. The surfaces can be prepared in
advance and stored.

In order to conduct the assay, the partner of the
immobilized species is exposed to the coated surface with
or without the test compound. After the reaction is
complete, unreacted components are removed (e.g., by
washing) and any complexes formed will remain immobilized
on the solid surface. The detection of complexes
anchored on the solid surface can be accomplished in a
number of ways. Where the non-immobilized species is
pre-labeled, the detection of label immobilized on the
surface indicates that complexes were formed. Where the
non-immobilized species is not pre-labeled, an indirect
label can be used to detect complexes anchored on the
surface; e.g., using a labeled antibody specific for the
initially non-immobilized species (the antibody, in turn,
can be directly labeled or indirectly labeled with a
labeled anti-Ig antibody). Depending upon the order of
addition of reaction components, test compounds which
inhibit complex formation or which disrupt preformed
complexes can be detected.

Alternatively, the reaction can be conducted in a
liquid phase in the presence or absence of the test
compound, the reaction products separated from unreacted
components, and complexes detected; e.g., using an
immobilized antibody specific for one of the binding
components to anchor any complexes formed in solution,
and a labeled antibody specific for the other partner to
detect anchored complexes. Again, depending upon the
order of addition of reactants to the liquid phase, test
compounds which inhibit complex formation or which
disrupt preformed complexes can be identified.

In an alternate embodiment of the invention, a
homogeneous assay can be used. In this approach, a
preformed complex of the target gene product and the
interactive cellular or extracellular binding partner
product is prepared in which either the target gene
products or their binding partners are labeled, but the
signal generated by the label is quenched due to complex
formation (see, e.g., U.S. Patent No. 4,109,496 by
Rubenstein which utilizes this approach for
immunoassays). The addition of a test substance that
competes with and displaces one of the species from the
preformed complex will result in the generation of a
signal above background. In this way, test substances
which disrupt target gene product-cellular or
extracellular binding partner interaction can be
identified.

In a particular embodiment, the target gene
product can be prepared for immobilization using
recombinant DNA techniques described in Section 5.1.2,
supra. For example, the target gene coding region can be
fused to a glutathione-S-transferase (GST) gene using a
fusion vector such as pGEX-5X-1, in such a manner that
its binding activity is maintained in the resulting
fusion product. The interactive cellular or
extracellular product can be purified and used to raise a
monoclonal antibody, using methods routinely practiced in
the art and described above, in Section 5.2.4. This
antibody can be labeled with the radioactive isotope 125I,
for example, by methods routinely practiced in the art.
In a heterogeneous assay, e.g., the GST-target gene
fusion product can be anchored to glutathione-agarose
beads. The interactive cellular or extracellular binding
partner product can then be added in the presence or
absence of the test compound in a manner that allows
interaction and binding to occur. At the end of the
reaction period, unbound material can be washed away, and
the labeled monoclonal antibody can be added to the
system and allowed to bind to the complexed components.
The interaction between the target gene product and the
interactive cellular or extracellular binding partner can
be detected by measuring the amount of radioactivity that
remains associated with the glutathione-agarose beads. A
successful inhibition of the interaction by the test
compound will result in a decrease in measured
radioactivity.

Alternatively, the GST-target gene fusion product
and the interactive cellular or extracellular binding
partner product can be mixed together in liquid in the
absence of the solid glutathione-agarose beads. The test
compound can be added either during or after the binding
partners are allowed to interact. This mixture can then
be added to the glutathione-agarose beads and unbound
material is washed away. Again the extent of inhibition
of the binding partner interaction can be detected by
adding the labeled antibody and measuring the
radioactivity associated with the beads.

In another embodiment of the invention, these same
techniques can be employed using peptide fragments that
correspond to the binding domains of the target gene
product and the interactive cellular or extracellular
binding partner (in cases where the binding partner is a
product), in place of one or both of the full length
products. Any number of methods routinely practiced in
the art can be used to identify and isolate the protein's
binding site. These methods include, but are not limited
to, mutagenesis of one of the genes encoding one of the
products and screening for disruption of binding in a
co-immunoprecipitation assay. Compensating mutations in the
gene encoding the second species in the complex can be
selected. Sequence analysis of the genes encoding the
respective products will reveal the mutations that
correspond to the region of the product involved in
interactive binding. Alternatively, one product can be
anchored to a solid surface using methods described in
this Section above, and allowed to interact with and bind
to its labeled binding partner, which has been treated
with a proteolytic enzyme, such as trypsin. After
washing, a short, labeled peptide comprising the binding
domain can remain associated with the solid material,
which can be isolated and identified by amino acid
sequencing. Also, once the gene coding for the cellular
or extracellular binding partner product is obtained,
short gene segments can be engineered to express peptide
fragments of the product, which can then be tested for
binding activity and purified or synthesized.

5.8.4. ASSAYS FOR AMELIORATION OF
TUMOR PROGRESSION SYMPTOMS

Any of the binding compounds, including but not
limited to, compounds such as those identified in the
foregoing assay systems, can be tested for the ability to
prevent and/or ameliorate symptoms of tumor progression
and tumor progression disorders, including metastatic
disease. Cell-based and animal model-based assays for
the identification of compounds exhibiting an ability to
prevent and/or ameliorate tumor progression symptoms are
described below.

First, cell-based systems such as those described,
above, in Section 5.7.2, can be used to identify
compounds which can act to ameliorate symptoms of tumor
progression. For example, such cell systems can be
exposed to a compound, suspected of exhibiting an ability
to ameliorate tumor progression symptoms, at a sufficient
concentration and for a time sufficient to elicit such an
amelioration in the exposed cells. After exposure, the
cells are examined to determine whether one or more tumor
progression state or tumor progression disorder
phenotypes has been altered to resemble a more normal or
more wild-type, non-neoplastic disease phenotype.

Taking, as an example, tumor progression involving
metastasis, cell-based systems such as the highly
metastatic B16 F10 melanoma cell line can be utilized.
Upon exposure to such cell systems, compounds can be
assayed for their ability to reduce the metastatic
potential of such cells. Further, the level of 030 gene
expression within these cells may be assayed.
Presumably, an increase in the observed level of 030 gene
expression would indicate an amelioration of the
metastatic tumor progression state.

In addition, animal-based systems, such as those
described, above, in Section 5.7.1, can be used to
identify compounds capable of ameliorating symptoms of
tumor progression. Such animal models can be used as
test substrates for the identification of drugs,
pharmaceuticals, therapies, and interventions which can
be effective in treating tumor progression disorders.
For example, animal models can be exposed to a compound
suspected to exhibit an ability to ameliorate tumor
progression symptoms, at a sufficient concentration and
for a time sufficient to elicit such an amelioration in
the exposed animals. The response of the animals to the
exposure can be monitored by assessing the reversal of
disorders associated with tumor progression. With regard
to intervention, any treatments which reverse any aspect
of symptoms of tumor progression, such as, for example,
those associated with metastatic disease, should be
considered as candidates for human therapeutic
intervention in the treatment of tumor progression.
Dosages of test agents can be determined by deriving
dose-response curves, as discussed in Section 5.10,
below.
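
As a rough illustration of how a dose-response curve might be
summarized, the sketch below estimates the dose producing a half-maximal
response by linear interpolation on a log-dose scale. The interpolation
method, the function name interpolate_ed50 and the example data are
assumptions made for illustration; they are not taken from Section 5.10.

```python
import math


def interpolate_ed50(doses, responses):
    """Estimate the dose giving a 50% response.

    doses: positive doses in ascending order.
    responses: matching responses as fractions (0.0 to 1.0) of the
        maximal effect.
    Uses linear interpolation on log10(dose) between the two measured
    points that bracket 50%; returns None if no pair brackets 50%.
    """
    points = list(zip(doses, responses))
    for (d_lo, r_lo), (d_hi, r_hi) in zip(points, points[1:]):
        brackets = (r_lo - 0.5) * (r_hi - 0.5) <= 0
        if brackets and r_lo != r_hi:
            frac = (0.5 - r_lo) / (r_hi - r_lo)
            log_dose = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_dose
    return None


# Hypothetical data: doses in arbitrary units, responses as fractions.
ed50 = interpolate_ed50([1.0, 100.0], [0.0, 1.0])
```

A fitted sigmoidal model would normally be preferred when enough data
points are available; the interpolation here is only the simplest
possible summary.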

Further, gene expression patterns can be utilized
to assess the ability of a compound to ameliorate
symptoms of tumor progression and tumor progression
disorders. For example, fingerprint gene expression or a
fingerprint pattern can then be used in such an
assessment. Fingerprint gene expression and fingerprint
patterns are described, below, in Section 5.11.

Fingerprint patterns can be characterized for
known states (e.g., normal or known pre-neoplastic,
neoplastic or metastatic states) within the cell- and/or
animal-based model systems. Subsequently, these known
fingerprint patterns can be compared to ascertain the
effect a test compound has to modify such fingerprint
patterns, and to cause the pattern to more closely
resemble that of a more desirable fingerprint pattern.

For example, administration of a compound can
cause the fingerprint pattern of a metastatic disease
model system to more closely resemble a control, normal
system. Administration of a compound can, alternatively,
cause the fingerprint pattern of a control system to
begin to mimic tumor progression states, such as
metastatic disease states.

5.8.5. MONITORING OF EFFECTS
DURING CLINICAL TRIALS

Monitoring the influence of compounds on tumor
progression can be applied not only in basic drug
screening, but also in clinical trials. In such clinical
trials, the expression of a panel of genes that have been
discovered in any one of the paradigms described in
Section 5.1.1.1 can be used as a "read out" of the tumor
progression state of a particular cell.

For example, and not by way of limitation, the
paradigm describing the B16 melanoma cells provides for
the identification of fingerprint genes (e.g., 030) that
are down-regulated in metastatic tumor cells. For
example, in a clinical trial, tumor cells can be isolated
from the primary tumors removed by surgery, and RNA
prepared and analyzed by differential display as
described in Section 6.1. The levels of expression of
the fingerprint genes can be quantified by Northern blot
analysis or RT-PCR, as described in Section 6.1, or
alternatively by measuring the amount of protein
produced, by one of the methods described in Section
5.7.2. In this way, the fingerprint profiles can serve
as putative biomarkers indicative of the metastatic
potential of the tumor cell. Thus, by monitoring the
level of expression of romy030, a protocol for suitable
chemotherapeutic anticancer drugs can be developed based
on the metastatic potential of tumor cells in the
primary. In cases of inoperable metastatic disease,
patients can have biopsies removed for measurement of
romy030 expression so that the drug's efficacy can be
measured by monitoring the degree of restored expression
of romy030.

5.9. COMPOUNDS AND METHODS FOR
TREATMENT OF TUMOR PROGRESSION

Described herein are methods and compositions
which can be used to ameliorate symptoms of tumor
progression and disorders involving tumor progression
via, first, target gene modulation, and/or second, via a
depletion of the cells involved in tumor progression.
Target gene modulation can be of a positive or negative
nature, depending on the specific situation involved, but
each modulatory event yields a net result in which tumor
progression symptoms are ameliorated.

"Negative modulation," as used herein, refers to a
reduction in the level and/or activity of target gene
product relative to the level and/or activity of the
target gene product in the absence of the modulatory
treatment.

"Positive modulation," as used herein, refers to
an increase in the level and/or activity of target gene
product relative to the level and/or activity of target
gene product in the absence of modulatory treatment.

It is possible that tumor progression can be
brought about, at least in part, by an abnormal level of
gene product, or by the presence of a gene product
exhibiting abnormal activity. As such, the reduction in
the level and/or activity of such gene products would
bring about the amelioration of tumor progression
symptoms. Negative modulatory techniques for the
reduction of target gene expression levels or target gene
product activity levels are discussed in Section 5.9.1,
below.



WO 96/30389 



PCT/US96/04413 




Alternatively, it is possible that tumor
progression can be brought about, at least in part, by
the absence or reduction of the level of gene expression,
or a reduction in the level of a gene product's activity.
As such, an increase in the level of gene expression
and/or the activity of such gene products would bring
about the amelioration of tumor progression symptoms.

For example, as demonstrated in the Example
presented in Section 6, below, a reduction in the level
of 030 gene expression correlates with a highly
metastatic tumor progression state. A 030 positive
modulatory technique which increased 030 gene expression
in cells within a highly metastatic tumor progression
state should, therefore, act to ameliorate the symptoms
of such a state. Further, because the 030 gene product
may exhibit general tumor suppressor features, it is
possible that a 030 positive modulatory technique could
ameliorate symptoms of many tumor progression events.

Positive modulatory techniques for increasing the 

20 target gene expression levels or target gene product 
activity levels are discussed in Section 5.9.2, below. 

Additionally, tumor progression treatment
techniques whereby the concentration of cells involved in
tumor progression is depleted are described, below, in
Section 5.9.3.

Among the tumor progression events which may be
treated are those associated with human tumors. Such
human tumors may include, for example, human melanomas,
breast, gastrointestinal, such as esophageal, stomach,
colon, bowel, colorectal and rectal cancers, prostate,
bladder, testicular, ovarian, uterine, cervical, brain,
lung, bronchial, larynx, pharynx, liver, pancreatic,
thyroid, bone, leukemias, lymphomas and various types of
skin cancers.

5.9.1. NEGATIVE MODULATORY TECHNIQUES
As discussed, above, successful treatment of tumor
progression symptoms and of disorders involving tumor
progression can be brought about by techniques which
serve to inhibit the expression or activity of target
gene products.

For example, compounds such as those identified
through assays described, above, in Section 5.8, which
exhibit negative modulatory activity, can be used in
accordance with the invention to prevent and/or
ameliorate symptoms of tumor progression, including tumor
progression involving metastatic disorders. As discussed
in Section 5.8, above, such molecules can include, but
are not limited to, peptides, phosphopeptides, small
organic or inorganic molecules, or antibodies (including,
for example, polyclonal, monoclonal, humanized,
anti-idiotype, chimeric or single chain antibodies, and Fab,
F(ab')2 and Fab expression library fragments, and
epitope-binding fragments thereof). Negative modulatory
techniques involving antibody administration are
described, below, in Section 5.9.1.2. Techniques for the
determination and administration of such compounds are
described, below, in Section 5.10.

Further, antisense and ribozyme molecules which
inhibit expression of the target gene can also be used in
accordance with the invention to reduce the level of
target gene expression, thus effectively reducing the
level of target gene activity. Still further, triple
helix molecules can be utilized in reducing the level of
target gene activity. Such techniques are described,
below, in Section 5.9.1.1.

5.9.1.1. NEGATIVE MODULATORY ANTISENSE,
RIBOZYME AND TRIPLE HELIX APPROACHES

Among the compounds which can exhibit the ability
to prevent and/or ameliorate symptoms of tumor
progression are antisense, ribozyme, and triple helix
molecules. Such molecules can be designed to reduce or
inhibit either wild type, or if appropriate, mutant
target gene activity. Techniques for the production and
use of such molecules are well known to those of skill in
the art.

Antisense RNA and DNA molecules act to directly
block the translation of mRNA by hybridizing to targeted
mRNA and preventing protein translation. With respect to
antisense DNA, oligodeoxyribonucleotides derived from the
translation initiation site, e.g., between the -10 and
+10 regions of the target gene nucleotide sequence of
interest, are preferred.
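
By way of illustration only, the selection of an antisense
oligodeoxyribonucleotide spanning the -10 to +10 region can be
sketched as follows. The function name `antisense_oligo` and its
parameters are hypothetical and do not appear in the disclosure. The
sketch assumes the common numbering convention in which +1 is the A
of the initiation codon and there is no position 0, so that the
window is ten bases upstream plus ten bases beginning at the A.

```python
# Illustrative sketch only; names and numbering convention are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(cdna: str, start_codon_index: int, flank: int = 10) -> str:
    """Return an antisense DNA oligo covering -flank..+flank of the start codon.

    cdna is the sense-strand sequence written 5'->3' in the DNA alphabet.
    start_codon_index is the 0-based index of the A of the ATG.
    """
    cdna = cdna.upper()
    if cdna[start_codon_index:start_codon_index + 3] != "ATG":
        raise ValueError("start_codon_index does not point at an ATG")
    if start_codon_index < flank or start_codon_index + flank > len(cdna):
        raise ValueError("sequence too short for the requested window")
    # Sense-strand window: `flank` bases upstream plus `flank` bases from the A.
    window = cdna[start_codon_index - flank:start_codon_index + flank]
    # The reverse complement is the antisense strand written 5'->3'.
    return window.translate(COMPLEMENT)[::-1]
```

A call such as `antisense_oligo(seq, seq.find("ATG"))` would return a
20-mer, provided the first ATG in `seq` is in fact the initiation
codon; that must be established independently.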

Ribozymes are enzymatic RNA molecules capable of
catalyzing the specific cleavage of RNA. (For a review,
see, for example, Rossi, J., 1994, Current Biology 4:469-
471). The mechanism of ribozyme action involves sequence
specific hybridization of the ribozyme molecule to
complementary target RNA, followed by an endonucleolytic
cleavage. The composition of ribozyme molecules must
include one or more sequences complementary to the target
gene mRNA and must include the well-known catalytic
sequence responsible for mRNA cleavage. For this
sequence, see U.S. Pat. No. 5,093,246. As such, within
the scope of the invention are engineered hammerhead
motif ribozyme molecules that specifically and
efficiently catalyze endonucleolytic cleavage of RNA
sequences encoding target gene proteins.

Specific ribozyme cleavage sites within any
potential RNA target are initially identified by scanning
the molecule of interest for ribozyme cleavage sites
which include the following sequences, GUA, GUU, and GUC.
Once identified, short RNA sequences of between 15 and 20
ribonucleotides corresponding to the region of the target
gene containing the cleavage site can be evaluated for
predicted structural features, such as secondary
structure, that can render the oligonucleotide sequence
unsuitable. The suitability of candidate sequences can
also be evaluated by testing their accessibility to
hybridization with complementary oligonucleotides, using
ribonuclease protection assays.
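
The scanning step described above can be sketched, under stated
assumptions, as a simple search for the three named triplets. The
function name `candidate_ribozyme_targets` and the choice to centre
the window on the triplet are hypothetical; the sketch only lists
candidate regions and does not attempt the structural evaluation or
the accessibility testing that the text calls for.

```python
import re


def candidate_ribozyme_targets(rna: str, length: int = 17):
    """Yield (site_index, triplet, window) for each GUA, GUU or GUC in rna.

    window is a stretch of `length` ribonucleotides (15 to 20, per the
    text) roughly centred on the triplet. Sites too close to either end
    of the sequence to fit a full window are skipped.
    """
    if not 15 <= length <= 20:
        raise ValueError("length should be between 15 and 20")
    # Accept DNA-alphabet input by converting T to U.
    rna = rna.upper().replace("T", "U")
    for match in re.finditer(r"GU[AUC]", rna):
        site = match.start()
        left = site - (length - 3) // 2
        right = left + length
        if left < 0 or right > len(rna):
            continue
        yield site, match.group(0), rna[left:right]
```

Each window returned would still need to be screened for secondary
structure and tested experimentally before being treated as a usable
target region.
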
Nucleic acid molecules to be used in triple helix
formation for the inhibition of transcription should be
single stranded and composed of deoxynucleotides. The
base composition of these oligonucleotides must be
designed to promote triple helix formation via Hoogsteen
base pairing rules, which generally require sizeable
stretches of either purines or pyrimidines to be present
on one strand of a duplex. Nucleotide sequences can be
pyrimidine-based, which will result in TAT and CGC+
triplets across the three associated strands of the
resulting triple helix. The pyrimidine-rich molecules
provide base complementarity to a purine-rich region of a
single strand of the duplex in a parallel orientation to
that strand. In addition, nucleic acid molecules can be
chosen that are purine-rich, for example, contain a
stretch of G residues. These molecules will form a
triple helix with a DNA duplex that is rich in GC pairs,
in which the majority of the purine residues are located
on a single strand of the targeted duplex, resulting in
GGC triplets across the three strands in the triplex.
Alternatively, the potential sequences that can be
targeted for triple helix formation can be increased by
creating a so-called "switchback" nucleic acid molecule.
Switchback molecules are synthesized in an alternating
5'-3', 3'-5' manner, such that they base pair with first
one strand of a duplex and then the other, eliminating
the necessity for a sizeable stretch of either purines or
pyrimidines to be present on one strand of a duplex.
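
As a rough illustration of the pyrimidine-based case, purine-rich
stretches on one strand of a duplex can be located and a parallel
pyrimidine third strand written out as below. The function names and
the default minimum tract length of 15 are assumptions made for the
sketch; the text specifies only that the stretches be "sizeable".

```python
import re

# T is placed opposite A (TAT triplet) and C opposite G (CGC+ triplet).
PURINE_TO_THIRD_STRAND = str.maketrans("AG", "TC")


def homopurine_tracts(strand: str, min_len: int = 15):
    """Return (start, tract) pairs for runs of A/G at least min_len long."""
    pattern = r"[AG]{%d,}" % min_len
    return [(m.start(), m.group(0)) for m in re.finditer(pattern, strand.upper())]


def pyrimidine_third_strand(purine_tract: str) -> str:
    """Return a pyrimidine third strand parallel to the given purine tract.

    Because the third strand runs parallel to the purine strand, the
    result is read in the same 5'->3' direction and is not reversed.
    """
    return purine_tract.upper().translate(PURINE_TO_THIRD_STRAND)
```

The purine-rich (GGC) and switchback designs mentioned in the text
are not covered by this sketch.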

In instances wherein the antisense, ribozyme,
and/or triple helix molecules described herein are
utilized to reduce or inhibit mutant gene expression, it
is possible that the technique utilized can also
efficiently reduce or inhibit the transcription (triple
helix) and/or translation (antisense, ribozyme) of mRNA
produced by normal target gene alleles such that the
possibility can arise wherein the concentration of normal
target gene product present can be lower than is
necessary for a normal phenotype. In such cases, to
ensure that substantially normal levels of target gene
activity are maintained, nucleic acid molecules that
encode and express target gene polypeptides exhibiting
normal target gene activity can be introduced into cells
via gene therapy methods such as those described, below,
in Section 5.9.2 that do not contain sequences
susceptible to whatever antisense, ribozyme, or triple
helix treatments are being utilized. Alternatively, in
instances whereby the target gene encodes an
extracellular protein, it can be preferable to
coadminister normal target gene protein into the cell or
tissue in order to maintain the requisite level of
cellular or tissue target gene activity.

Antisense RNA and DNA, ribozyme and triple helix
molecules of the invention can be prepared by any method
known in the art for the synthesis of DNA and RNA
molecules. These include techniques for chemically
synthesizing oligodeoxyribonucleotides and
oligoribonucleotides well known in the art such as, for
example, solid phase phosphoramidite chemical synthesis.
Alternatively, RNA molecules can be generated by in vitro
and in vivo transcription of DNA sequences encoding the
antisense RNA molecule. Such DNA sequences can be
incorporated into a wide variety of vectors which
incorporate suitable RNA polymerase promoters such as the
T7 or SP6 polymerase promoters. Alternatively, antisense
cDNA constructs that synthesize antisense RNA
constitutively or inducibly, depending on the promoter
used, can be introduced stably into cell lines.

Various well-known modifications to the DNA
molecules can be introduced as a means of increasing
intracellular stability and half-life. Possible
modifications include but are not limited to the addition
of flanking sequences of ribo- or deoxy- nucleotides to
the 5' and/or 3' ends of the molecule or the use of
phosphorothioate or 2' O-methyl rather than
phosphodiester linkages within the oligodeoxyribonucleotide
backbone.

5.9.1.2. NEGATIVE MODULATORY ANTIBODY TECHNIQUES
Antibodies can be generated which are both
specific for target gene product and which reduce target
gene product activity. Such antibodies may, therefore,
be administered in instances whereby negative modulatory
techniques are appropriate for the treatment of tumor
progression. Antibodies can be generated using standard
techniques described in Section 5.6, above, against the
proteins themselves or against peptides corresponding to
portions of the proteins. The antibodies include but are
not limited to polyclonal, monoclonal, Fab fragments,
single chain antibodies, chimeric antibodies, and the
like.

In instances where the target gene protein to
which the antibody is directed is intracellular and whole
antibodies are used, internalizing antibodies can be
preferred. However, lipofectin or liposomes can be used
to deliver the antibody or a fragment of the Fab region
which binds to the target gene epitope into cells. Where
fragments of the antibody are used, the smallest
inhibitory fragment which binds to the target protein's
binding domain is preferred. For example, peptides
having an amino acid sequence corresponding to the domain
of the variable region of the antibody that binds to the
target gene protein can be used. Such peptides can be
synthesized chemically or produced via recombinant DNA
technology using methods well known in the art (e.g., see
Creighton, 1983, supra; and Sambrook et al., 1989,
supra). Alternatively, single chain neutralizing
antibodies which bind to intracellular target gene
product epitopes can also be administered. Such single
chain antibodies can be administered, for example, by
expressing nucleotide sequences encoding single-chain
antibodies within the target cell population by
utilizing, for example, techniques such as those
described in Marasco et al. (Marasco, W. et al., 1993,
Proc. Natl. Acad. Sci. USA 90:7889-7893).

In instances where the target gene protein is
extracellular, or is a transmembrane protein, any of the
administration techniques described, below, in Section
5.10 which are appropriate for peptide administration can
be utilized to effectively administer inhibitory target
gene antibodies to their site of action.

5.9.2. POSITIVE MODULATORY TECHNIQUES

As discussed above, successful treatment of tumor
progression symptoms and of disorders involving tumor
progression can be brought about by techniques which
serve to increase the level of target gene expression or
to increase the activity of a target gene product.

For example, compounds such as those identified
through assays described, above, in Section 5.8, which
exhibit positive modulatory activity can be used in
accordance with the invention to ameliorate tumor
progression symptoms. As discussed in Section 5.8,
above, such molecules can include, but are not limited
to, peptides, phosphopeptides, small organic or inorganic
molecules, or antibodies (including, for example,
polyclonal, monoclonal, humanized, anti-idiotypic,
chimeric or single chain antibodies, and Fab, F(ab')2 and
Fab expression library fragments, and epitope-binding
fragments thereof). Positive modulatory techniques
involving antibody administration are described, below,
in Section 5.9.2.1.

For example, a target gene protein, at a level
sufficient to ameliorate tumor progression symptoms, can
be administered to a patient exhibiting such symptoms.
Any of the techniques discussed, below, in Section 5.10,
can be utilized for such administration. One of skill in
the art will readily know how to determine the
concentration of effective, non-toxic doses of the normal
target gene protein, utilizing techniques such as those
described, below, in Section 5.10.1.

In instances wherein the compound to be
administered is a peptide compound, DNA sequences
encoding the peptide compound can, alternatively, be
directly administered to a patient exhibiting tumor
progression symptoms, at a concentration sufficient to
generate the production of an amount of target gene
product adequate to ameliorate tumor progression
symptoms. Any of the techniques described, below, in
Section 5.10, which achieve intracellular administration,
can be utilized for the administration of such DNA
molecules. The DNA molecules can be produced, for
example, by well-known recombinant techniques.

In the case of peptide compounds which act
extracellularly, the DNA molecules encoding such peptides
can be taken up and expressed by any cell type, so long
as a sufficient circulating concentration of peptide
results for the elicitation of a reduction in tumor
progression symptoms.

In the case of compounds which act
intracellularly, the DNA molecules encoding such peptides
must be taken up and expressed by cells involved in the
tumor progression at a sufficient level to bring about
the reduction of tumor progression symptoms.

Any technique which serves to selectively
administer DNA molecules to a cell involved in tumor
progression is, therefore, preferred for the DNA
molecules encoding intracellularly acting peptides.

Further, patients can be treated for symptoms of
tumor progression by gene replacement therapy. One or
more copies of a normal target gene or a portion of the
gene that directs the production of a normal target gene
protein with target gene function can be inserted into
cells, using vectors which include, but are not limited
to, adenovirus, adeno-associated virus, and retrovirus
vectors, in addition to other particles that introduce
DNA into cells, such as liposomes. Techniques such as
those described above can be utilized for the
introduction of normal target gene sequences into human
cells.

In instances wherein the target gene encodes an
extracellular, secreted gene product, such gene
replacement techniques may be accomplished either in vivo
or in vitro. For such cases, the cell type expressing
the target gene is less important than achieving a
sufficient circulating concentration of the extracellular
molecules for the amelioration of tumor progression
symptoms to occur. In vitro, target gene sequences can
be introduced into autologous cells. Those cells
expressing the target gene sequence of interest can then
be reintroduced, preferably by intravenous
administration, into the patient such that there results
an amelioration of tumor progression symptoms.

In instances wherein the gene replacement involves
a gene which encodes a product which acts
intracellularly, it is preferred that gene replacement be
accomplished in vivo. Further, because the cell type in
which the gene replacement must occur is the cell type
involved in tumor progression, such techniques must
successfully target such tumor progression cells.

Taking the 030 gene as an example, an increase in
030 expression can serve to ameliorate tumor progression
symptoms, such as, for example, tumor progression
symptoms involving metastatic processes. Therefore, any
positive modulatory technique described herein which
increases the 030 gene product or gene product activity
to a level which is sufficient to ameliorate tumor
progression symptoms represents a successful tumor
progression therapeutic treatment.

5.9.3. METHODS FOR DEPLETING CELLS
INVOLVED IN TUMOR PROGRESSION

Techniques described herein can be utilized to
deplete the total number of cells involved in tumor
progression, thus effectively decreasing the ratio of the
tumor cells to non-cancerous cells. Specifically,
separation techniques are described which can be used to
deplete the total number of tumor cells present within a
cell population, and, further, targeting techniques are
described which can be utilized to deplete specific tumor
cell subpopulations.

Depending on the particular application, changing
the number of cells belonging to a tumor cell population
can yield inhibitory responses leading to the
amelioration of cancerous disorders.

The separation techniques described herein are
based on the presence or absence of specific cell
surface, preferably transmembrane, markers. By way of
example, and not by way of limitation, the techniques
described herein utilize tumor specific cell surface
markers or antigens and will describe procedures whereby
tumor cells can be separated from other cells, thus
allowing for selective depletion of tumor cells.

Separation techniques can be utilized which
separate and purify cells, tumor cells, for example, in
vitro from a population of cells, such as hematopoietic
cells autologous to the patient being treated. For
example, an initial tumor cell subpopulation-containing
population of cells, such as hematopoietic cells, can be
obtained from a leukemia patient using standard
procedures well known to those of skill in the art.
Peripheral blood can be utilized as one potential
starting source for such techniques, and can, for
example, be obtained via venipuncture and collection into
heparinized tubes.

Once the starting source of autologous cells is
obtained, tumor cells can be removed, and thus
selectively separated and purified, by various methods
which utilize antibodies which bind specific markers
present on tumor cells while absent on other cells within
the starting source. These techniques can include, for
example, flow cytometry using a fluorescence activated
cell sorter (FACS) and specific fluorochromes, biotin-avidin
or biotin-streptavidin separations using biotin
conjugated to cell surface marker-specific antibodies and
avidin or streptavidin bound to a solid support such as
an affinity column matrix or plastic surfaces, or magnetic
separations using antibody-coated magnetic beads.

Separation via antibodies for specific markers can
be by negative or positive selection procedures. In
negative separation, antibodies are used which are
specific for markers present on undesired cells, in this
case tumor cells, which exhibit, for example, the tumor
specific cell surface marker. Cells bound by an antibody
to such a cell surface marker can be removed or lysed and
the remaining desired mixture retained. In positive
separation, antibodies specific for markers present on
the desired cells of interest, in this case tumor-like
cells, are used. Cells bound by the antibody are
separated and retained. It will be understood that
positive and negative separations can be used
substantially simultaneously or in a sequential manner.
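
The logic of negative and positive separation can be sketched on a
toy data model, in which each cell is a record carrying the set of
surface markers it exhibits. The record layout, the function names
and the marker label `TUMOR_MARKER` are hypothetical and serve only
to make the two selection modes concrete; no laboratory procedure is
modelled.

```python
def negative_selection(cells, marker):
    """Discard cells bearing `marker`; retain the remaining mixture."""
    return [cell for cell in cells if marker not in cell["markers"]]


def positive_selection(cells, marker):
    """Retain only the cells bearing `marker`."""
    return [cell for cell in cells if marker in cell["markers"]]


# Toy population: two marker-positive cells and one marker-negative cell.
population = [
    {"id": 1, "markers": {"TUMOR_MARKER"}},
    {"id": 2, "markers": set()},
    {"id": 3, "markers": {"TUMOR_MARKER", "OTHER_MARKER"}},
]

depleted = negative_selection(population, "TUMOR_MARKER")
enriched = positive_selection(population, "TUMOR_MARKER")
```

Applying the two functions one after the other, with different
markers, corresponds to the sequential use of positive and negative
separations mentioned above.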

A common technique for antibody based separation
is the use of flow cytometry such as by a fluorescence
activated cell sorter (FACS). Typically, separation by
flow cytometry is performed as follows. The suspended
mixture of cells is centrifuged and resuspended in
media. Antibodies which are conjugated to fluorochrome
are added to allow the binding of the antibodies to
specific cell surface markers. The cell mixture is then
washed by one or more centrifugation and resuspension
steps. The mixture is run through a FACS which separates
the cells based on different fluorescence
characteristics. FACS systems are available in varying
levels of performance and ability, including multi-color
analysis. The facilitating cell can be identified by a
characteristic profile of forward and side scatter which
is influenced by size and granularity, as well as by
positive and/or negative expression of certain cell
surface markers.

Other separation techniques besides flow cytometry
can also provide fast separations. One such method is
biotin-avidin based separation by affinity
chromatography. Typically, such a technique is performed
by incubating cells with biotin-coupled antibodies to
specific markers, such as, for example, the transmembrane
protein encoded by the tumor-specific marker, followed by
passage through an avidin column. Biotin-antibody-cell
complexes bind to the column via the biotin-avidin
interaction, while other cells pass through the column.
The specificity of the biotin-avidin system is well
suited for rapid positive separation. Multiple passages
can ensure separation of a sufficient level of the tumor
cell subpopulation of interest.

In instances whereby the goal of the separation
technique is to deplete the overall number of cells
belonging to the tumor cell subpopulation, the cells
derived from the starting source of cells which has now
been effectively depleted of tumor cells can be
reintroduced into the patient. Such a depletion of the
tumor cell subpopulation results in the amelioration of
cancerous disorders associated with tumor progression.

In instances whereby the goal of the separation
technique is to augment or increase the overall number of
cells belonging to a non-cancerous cell subpopulation,
cells derived from the purified normal cell subpopulation
can be reintroduced into the patient, thus resulting in
the amelioration of cancerous disorders associated with
an underactivity of the normal cell subpopulation.

The cells to be reintroduced will be cultured and
expanded ex vivo prior to reintroduction. Purified
normal cell subpopulation cells can be washed, suspended
in, for example, buffered saline, and reintroduced into
the patient via intravenous administration.

Cells to be expanded can be cultured, using
standard procedures, in the presence of an appropriate
expansion agent which induces proliferation of the
purified normal cell subpopulation. Such an expansion
agent can, for example, be any appropriate cytokine,
antigen, or antibody.

Prior to being reintroduced into a patient, the
purified normal cells can be modified by, for example,
transformation with gene sequences encoding gene products
of interest. Such gene products should represent
products which enhance the activity of the purified
normal cell subpopulation or, alternatively, represent
products which repress the activity of one or more of the
other normal cell subpopulations. Cell transformation
and gene expression procedures are well known to those of
skill in the art, and can be as those described, above,
in Section 5.2.

Well-known targeting methods can, additionally, be
utilized in instances wherein the goal is to deplete the
number of cells belonging to a specific tumor cell
subpopulation. Such targeting methods can be in vivo or
in vitro, and can involve the introduction of targeting
agents into a population of cells such that the targeting
agents selectively destroy a specific subset of the cells
within the population. In vivo administration techniques
which can be followed for such targeting agents are
described, below, in Section 5.10.

Targeting agents generally comprise, first, a
targeting moiety which, in the current instance, causes
the targeting agent to selectively associate with a
specific tumor cell subpopulation. The targeting agents
generally comprise, second, a moiety capable of
destroying a cell with which the targeting agent has
become associated.

Targeting moieties can include, but are not
limited to, antibodies directed to cell surface markers
found specifically on the tumor cell subpopulation being
targeted, or, alternatively, ligands, such as growth
factors, which bind receptor-type molecules found
exclusively on the targeted tumor cell subpopulation.

Destructive moieties include any moiety capable of
inactivating or destroying a cell to which the targeting
agent has become bound. For example, a destructive
moiety can include, but is not limited to, cytotoxins
or radioactive agents. Cytotoxins include, for example,
plant-, fungus-, or bacteria-derived toxins, with
deglycosylated Ricin A chain toxins being generally
preferred due to their potency and lengthy half-lives.

5.10. PHARMACEUTICAL PREPARATIONS
AND METHODS OF ADMINISTRATION

The identified compounds that inhibit target gene
expression, synthesis and/or activity can be administered
to a patient at therapeutically effective doses to
prevent, treat or ameliorate tumor progression. A
therapeutically effective dose refers to that amount of
the compound sufficient to result in amelioration of
symptoms of tumor progression.

5.10.1. EFFECTIVE DOSE

Toxicity and therapeutic efficacy of such
compounds can be determined by standard pharmaceutical
procedures in cell cultures or experimental animals,
e.g., for determining the LD50 (the dose lethal to 50% of
the population) and the ED50 (the dose therapeutically
effective in 50% of the population). The dose ratio
between toxic and therapeutic effects is the therapeutic
index and it can be expressed as the ratio LD50/ED50.
Compounds which exhibit large therapeutic indices are
preferred. While compounds that exhibit toxic side
effects can be used, care should be taken to design a
delivery system that targets such compounds to the site
of affected tissue in order to minimize potential damage
to uninfected cells and, thereby, reduce side effects.
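
The therapeutic index defined above is a simple ratio, and its
arithmetic can be shown in a few lines. The function name
`therapeutic_index` and the numerical values used in the note below
are hypothetical and are not data from the disclosure; both doses
must be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50/ED50 for doses given in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50
```

For instance, a purely hypothetical compound with an LD50 of 200
mg/kg and an ED50 of 10 mg/kg would have a therapeutic index of 20,
and would be preferred over one with an index of 2, all else being
equal.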

The data obtained from the cell culture assays and
animal studies can be used in formulating a range of
dosage for use in humans. The dosage of such compounds
lies preferably within a range of circulating
concentrations that include the ED50 with little or no
toxicity. The dosage can vary within this range
depending upon the dosage form employed and the route of
administration utilized. For any compound used in the
method of the invention, the therapeutically effective
dose can be estimated initially from cell culture assays.
A dose can be formulated in animal models to achieve a
circulating plasma concentration range that includes the
IC50 (i.e., the concentration of the test compound which
achieves a half-maximal inhibition of symptoms) as
determined in cell culture. Such information can be used
to more accurately determine useful doses in humans.
Levels in plasma can be measured, for example, by high
performance liquid chromatography.
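
One simple way of reading an IC50 off cell culture data, offered here
only as a sketch, is to interpolate between the two tested
concentrations that bracket 50% inhibition. The function name
`estimate_ic50` and the choice of interpolating linearly in log
concentration are assumptions; the text does not prescribe any
particular estimation method, and a fitted dose-response curve would
usually be preferred in practice.

```python
import math


def estimate_ic50(concentrations, inhibition):
    """Interpolate the concentration giving 50% inhibition.

    concentrations: positive values in increasing order.
    inhibition: percent inhibition (0 to 100) observed at each one.
    Interpolation is linear in log10(concentration) between the first
    pair of adjacent points that bracket 50%.
    """
    if len(concentrations) != len(inhibition):
        raise ValueError("inputs must have the same length")
    points = list(zip(concentrations, inhibition))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi > y_lo:
            fraction = (50.0 - y_lo) / (y_hi - y_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + fraction * (log_hi - log_lo))
    raise ValueError("50% inhibition is not bracketed by the data")
```

If no adjacent pair of points brackets 50% inhibition, the function
raises an error rather than extrapolating beyond the tested range.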

5.10.2. FORMULATIONS AND USE

Pharmaceutical compositions for use in accordance
with the present invention can be formulated in
conventional manner using one or more physiologically
acceptable carriers or excipients.

Thus, the compounds and their physiologically
acceptable salts and solvates can be formulated for
administration by inhalation or insufflation (either
through the mouth or the nose) or oral, buccal,
parenteral or rectal administration.

For oral administration, the pharmaceutical
compositions can take the form of, for example, tablets
or capsules prepared by conventional means with
pharmaceutically acceptable excipients such as binding
agents (e.g., pregelatinised maize starch,
polyvinylpyrrolidone or hydroxypropyl methylcellulose);
fillers (e.g., lactose, microcrystalline cellulose or
calcium hydrogen phosphate); lubricants (e.g., magnesium
stearate, talc or silica); disintegrants (e.g., potato
starch or sodium starch glycolate); or wetting agents
(e.g., sodium lauryl sulphate). The tablets can be
coated by methods well known in the art. Liquid
preparations for oral administration can take the form
of, for example, solutions, syrups or suspensions, or
they can be presented as a dry product for constitution
with water or other suitable vehicle before use. Such
liquid preparations can be prepared by conventional means
with pharmaceutically acceptable additives such as
suspending agents (e.g., sorbitol syrup, cellulose
derivatives or hydrogenated edible fats); emulsifying
agents (e.g., lecithin or acacia); non-aqueous vehicles
(e.g., almond oil, oily esters, ethyl alcohol or
fractionated vegetable oils); and preservatives (e.g.,
methyl or propyl-p-hydroxybenzoates or sorbic acid). The
preparations can also contain buffer salts, flavoring,
coloring and sweetening agents as appropriate.

Preparations for oral administration can be 
suitably formulated to give controlled release of the 
active compound. 

For buccal administration the compositions can
take the form of tablets or lozenges formulated in
conventional manner.

For administration by inhalation, the compounds
for use according to the present invention are
conveniently delivered in the form of an aerosol spray
presentation from pressurized packs or a nebulizer, with
the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane,
dichlorotetrafluoroethane, carbon dioxide or other
suitable gas. In the case of a pressurized aerosol the
dosage unit can be determined by providing a valve to
deliver a metered amount. Capsules and cartridges of,
e.g., gelatin for use in an inhaler or insufflator can be
formulated containing a powder mix of the compound and a
suitable powder base such as lactose or starch.

The compounds can be formulated for parenteral
administration by injection, e.g., by bolus injection or
continuous infusion. Formulations for injection can be
presented in unit dosage form, e.g., in ampoules or in
multi-dose containers, with an added preservative. The
compositions can take such forms as suspensions,
solutions or emulsions in oily or aqueous vehicles, and
can contain formulatory agents such as suspending,
stabilizing and/or dispersing agents. Alternatively, the
active ingredient can be in powder form for constitution
with a suitable vehicle, e.g., sterile pyrogen-free
water, before use.

The compounds can also be formulated in rectal
compositions such as suppositories or retention enemas,
e.g., containing conventional suppository bases such as
cocoa butter or other glycerides.

In addition to the formulations described
previously, the compounds can also be formulated as a
depot preparation. Such long acting formulations can be
administered by implantation (for example, subcutaneously
or intramuscularly) or by intramuscular injection. Thus,
for example, the compounds can be formulated with
suitable polymeric or hydrophobic materials (for example
as an emulsion in an acceptable oil) or ion exchange
resins, or as sparingly soluble derivatives, for example,
as a sparingly soluble salt.

The compositions can, if desired, be presented in
a pack or dispenser device which can contain one or more
unit dosage forms containing the active ingredient. The
pack can for example comprise metal or plastic foil, such
as a blister pack. The pack or dispenser device can be
accompanied by instructions for administration.

5.11. DIAGNOSIS OF TUMOR PROGRESSION

A variety of methods can be employed for the
diagnosis of tumor progression and of disorders involving
tumor progression, including metastatic diseases. Such
methods can, for example, utilize reagents such as
fingerprint gene nucleotide sequences described in
Section 5.2.1, and antibodies directed against
differentially expressed and pathway gene peptides, as
described, above, in Section 5.2.1 (peptides) and 5.2.3
(antibodies). Specifically, such reagents can be used,
for example, for the detection of the presence of target
gene mutations, or the detection of either over- or
under-expression of target gene mRNA.

The methods described herein can be performed, for
example, by utilizing pre-packaged diagnostic kits
comprising at least one specific fingerprint gene nucleic
acid or anti-fingerprint gene antibody reagent described
herein, which can be conveniently used, e.g., in clinical
settings, to diagnose patients exhibiting symptoms of
metastatic diseases.

Any cell type or tissue, preferably T-cells, in 
which the fingerprint gene is expressed can be utilized 
in the diagnostics described below. 

5.11.1. DETECTION OF FINGERPRINT
GENE NUCLEIC ACIDS

DNA or RNA from the cell type or tissue to be
analyzed can easily be isolated using procedures which
are well known to those in the art. Diagnostic
procedures can also be performed "in situ" directly upon
tissue sections (fixed and/or frozen) of patient tissue
obtained from biopsies or resections, such that no
nucleic acid purification is necessary. Nucleic acid
reagents such as those described in Section 5.1 can be
used as probes and/or primers for such in situ procedures
(see, for example, Nuovo, G.J., 1992, PCR in situ
hybridization: Protocols and Applications, Raven Press,
NY).

Fingerprint gene nucleotide sequences, either RNA
or DNA, can, for example, be used in hybridization or
amplification assays of biological samples to detect gene
structures and expression associated with metastasis.
Such assays can include, but are not limited to, Southern
or Northern analyses, single stranded conformational
polymorphism analyses, in situ hybridization assays, and
polymerase chain reaction analyses. Such analyses can
reveal both quantitative aspects of the expression
pattern of the fingerprint gene, and qualitative aspects
of the fingerprint gene expression and/or gene
composition. That is, such techniques can detect, for
example, point mutations, insertions, deletions,
chromosomal rearrangements, and/or activation or
inactivation of gene expression.

Preferred diagnostic methods for the detection of
fingerprint gene-specific nucleic acid molecules can
involve, for example, contacting and incubating nucleic
acids, derived from the cell type or tissue being
analyzed, with one or more labeled nucleic acid reagents
as are described in Section 5.1, under conditions
favorable for the specific annealing of these reagents to
their complementary sequences within the nucleic acid
molecule of interest. Preferably, the lengths of these
nucleic acid reagents are at least 15 to 30 nucleotides.
After incubation, all non-annealed nucleic acids are
removed from the nucleic acid:fingerprint RNA molecule
hybrid. The presence of nucleic acids from the target
tissue which have hybridized, if any such molecules
exist, is then detected. Using such a detection scheme,
the nucleic acid from the tissue or cell type of interest
can be immobilized, for example, to a solid support such
as a membrane, or a plastic surface such as that on a
microtitre plate or polystyrene beads. In this case,
after incubation, non-annealed, labeled fingerprint
nucleic acid reagents of the type described in Section
5.1 are easily removed. Detection of the remaining,
annealed, labeled nucleic acid reagents is accomplished
using standard techniques well-known to those in the art.

Alternative diagnostic methods for the detection
of fingerprint gene specific nucleic acid molecules can
involve their amplification, e.g., by PCR (the
experimental embodiment set forth in Mullis, K.B., 1987,
U.S. Patent No. 4,683,202), ligase chain reaction
(Barany, F., 1991, Proc. Natl. Acad. Sci. USA 88:189-
193), self sustained sequence replication (Guatelli, J.C.
et al., 1990, Proc. Natl. Acad. Sci. USA 87:1874-1878),
transcriptional amplification system (Kwoh, D.Y. et al.,
1989, Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta
Replicase (Lizardi, P.M. et al., 1988, Bio/Technology
6:1197), or any other nucleic acid amplification method,
followed by the detection of the amplified molecules
using techniques well known to those of skill in the art.
These detection schemes are especially useful for the
detection of nucleic acid molecules if such molecules are
present in very low numbers.

In one embodiment of such a detection scheme, a
cDNA molecule is obtained from an RNA molecule of
interest (e.g., by reverse transcription of the RNA
molecule into cDNA). Cell types or tissues from which
such RNA can be isolated include any tissue in which wild
type fingerprint gene is known to be expressed. A
sequence within the cDNA is then used as the template for
a nucleic acid amplification reaction, such as a PCR
amplification reaction, or the like. The nucleic acid
reagents used as synthesis initiation reagents (e.g.,
primers) in the reverse transcription and nucleic acid
amplification steps of this method are chosen from among
the fingerprint gene nucleic acid reagents described in
Section 5.1. The preferred lengths of such nucleic acid
reagents are at least 19-30 nucleotides. For detection
of the amplified product, the nucleic acid amplification
can be performed using radioactively or non-radioactively
labeled nucleotides. Alternatively, enough amplified
product can be made such that the product can be
visualized by standard ethidium bromide staining or by
utilizing any other suitable nucleic acid staining
method.

In addition to methods which focus primarily on
the detection of one nucleic acid sequence, fingerprint
profiles, as discussed in Section 5.3.4, can also be
assessed in such detection schemes. Fingerprint profiles
can be generated, for example, by utilizing a
differential display procedure, as discussed above in
Section 5.1.1.2, Northern analysis and/or RT-PCR. Any of
the gene sequences described, above, in Section 5.2.1 can
be used as probes and/or PCR primers for the generation
and corroboration of such fingerprint profiles.

5.11.2. DETECTION OF TARGET GENE PEPTIDES

Antibodies directed against wild type or mutant
fingerprint gene peptides, which are discussed, above, in
Section 5.2.3, can also be used in tumor progression
diagnostics and prognostics, as described, for example,
herein. Such diagnostic methods can be used to detect
abnormalities in the level of fingerprint gene protein
expression, or abnormalities in the structure and/or
tissue, cellular, or subcellular location of
fingerprint gene protein. Structural differences can
include, for example, differences in the size,
electronegativity, or antigenicity of the mutant
fingerprint gene protein relative to the normal
fingerprint gene protein.

Protein from the tissue or cell type to be
analyzed can easily be isolated using techniques which
are well known to those of skill in the art. The protein
isolation methods employed herein can, for example, be
such as those described in Harlow and Lane (Harlow, E.
and Lane, D., 1988, "Antibodies: A Laboratory Manual,"
Cold Spring Harbor Laboratory Press, Cold Spring Harbor,
New York).

Preferred diagnostic methods for the detection of
wild type or mutant fingerprint gene peptide molecules
can involve, for example, immunoassays wherein
fingerprint gene peptides are detected by their
interaction with an anti-fingerprint gene specific
peptide antibody.

For example, antibodies, or fragments of
antibodies, such as those described, above, in Section
5.2.3, useful in the present invention can be used to
quantitatively or qualitatively detect the presence of
wild type or mutant fingerprint gene peptides. This can
be accomplished, for example, by immunofluorescence
techniques employing a fluorescently labeled antibody
(see below) coupled with light microscopic, flow
cytometric, or fluorimetric detection. Such techniques
are especially preferred if the fingerprint gene peptides
are expressed on the cell surface.

The antibodies (or fragments thereof) useful in
the present invention can, additionally, be employed
histologically, as in immunofluorescence or
immunoelectron microscopy, for in situ detection of
target gene peptides. In situ detection can be
accomplished by removing a histological specimen from a
patient, and applying thereto a labeled antibody of the
present invention. The antibody (or fragment) is
preferably applied by overlaying the labeled antibody (or
fragment) onto a biological sample. Through the use of
such a procedure, it is possible to determine not only
the presence of the fingerprint gene peptides, but also
their distribution in the examined tissue. Using the
present invention, those of ordinary skill will readily
perceive that any of a wide variety of histological
methods (such as staining procedures) can be modified in
order to achieve such in situ detection.

Immunoassays for wild type or mutant fingerprint
gene peptides typically comprise incubating a biological
sample, such as a biological fluid, a tissue extract,
freshly harvested cells, or cells which have been
incubated in tissue culture, in the presence of a
detectably labeled antibody capable of identifying
fingerprint gene peptides, and detecting the bound
antibody by any of a number of techniques well-known in
the art.

The biological sample can be brought in contact
with and immobilized onto a solid phase support or
carrier such as nitrocellulose, or other solid support
which is capable of immobilizing cells, cell particles or
soluble proteins. The support can then be washed with
suitable buffers followed by treatment with the
detectably labeled fingerprint gene specific antibody.
The solid phase support can then be washed with the
buffer a second time to remove unbound antibody. The
amount of bound label on the solid support can then be
detected by conventional means.

By "solid phase support or carrier" is intended
any support capable of binding an antigen or an antibody.
Well-known supports or carriers include glass,
polystyrene, polypropylene, polyethylene, dextran, nylon,
amylases, natural and modified celluloses,
polyacrylamides, gabbros, and magnetite. The nature of
the carrier can be either soluble to some extent or
insoluble for the purposes of the present invention. The
support material can have virtually any possible
structural configuration so long as the coupled molecule
is capable of binding to an antigen or antibody. Thus,
the support configuration can be spherical, as in a bead,
or cylindrical, as in the inside surface of a test tube,
or the external surface of a rod. Alternatively, the
surface can be flat such as a sheet, test strip, etc.
Preferred supports include polystyrene beads. Those
skilled in the art will know many other suitable carriers
for binding antibody or antigen, or will be able to
ascertain the same by use of routine experimentation.

The binding activity of a given lot of anti-wild
type or mutant fingerprint gene peptide antibody can be
determined according to well-known methods. Those
skilled in the art will be able to determine operative
and optimal assay conditions for each determination by
employing routine experimentation.

One of the ways in which the fingerprint gene
peptide-specific antibody can be detectably labeled is by
linking the same to an enzyme for use in an enzyme
immunoassay (EIA) (Voller, A., "The Enzyme Linked
Immunosorbent Assay (ELISA)," Diagnostic Horizons 2:1-7,
1978) (Microbiological Associates Quarterly Publication,
Walkersville, MD); Voller, A. et al., J. Clin. Pathol.
31:507-520 (1978); Butler, J.E., Meth. Enzymol. 73:482-
523 (1981); Maggio, E. (ed.), ENZYME IMMUNOASSAY, CRC
Press, Boca Raton, FL, 1980; Ishikawa, E. et al., (eds.)
ENZYME IMMUNOASSAY, Kgaku Shoin, Tokyo, 1981). The
enzyme which is bound to the antibody will react with an
appropriate substrate, preferably a chromogenic
substrate, in such a manner as to produce a chemical
moiety which can be detected, for example, by
spectrophotometric, fluorimetric or visual means.
Enzymes which can be used to detectably label the
antibody include, but are not limited to, malate
dehydrogenase, staphylococcal nuclease, delta-5-steroid
isomerase, yeast alcohol dehydrogenase, alpha-
glycerophosphate dehydrogenase, triose phosphate
isomerase, horseradish peroxidase, alkaline phosphatase,
asparaginase, glucose oxidase, beta-galactosidase,
ribonuclease, urease, catalase, glucose-6-phosphate
dehydrogenase, glucoamylase and acetylcholinesterase.
The detection can be accomplished by colorimetric methods
which employ a chromogenic substrate for the enzyme.
Detection can also be accomplished by visual comparison
of the extent of enzymatic reaction of a substrate in
comparison with similarly prepared standards.

Detection can also be accomplished using any of a
variety of other immunoassays. For example, by
radioactively labeling the antibodies or antibody
fragments, it is possible to detect fingerprint gene wild
type or mutant peptides through the use of a
radioimmunoassay (RIA) (see, for example, Weintraub, B.,
Principles of Radioimmunoassays, Seventh Training Course
on Radioligand Assay Techniques, The Endocrine Society,
March, 1986). The radioactive isotope can be detected by
such means as the use of a gamma counter or a
scintillation counter or by autoradiography.

It is also possible to label the antibody with a
fluorescent compound. When the fluorescently labeled
antibody is exposed to light of the proper wavelength,
its presence can then be detected due to fluorescence.
Among the most commonly used fluorescent labeling
compounds are fluorescein isothiocyanate, rhodamine,
phycoerythrin, phycocyanin, allophycocyanin, o-
phthaldehyde and fluorescamine.

The antibody can also be detectably labeled using
fluorescence emitting metals such as 152Eu, or others of
the lanthanide series. These metals can be attached to
the antibody using such metal chelating groups as
diethylenetriaminepentaacetic acid (DTPA) or
ethylenediaminetetraacetic acid (EDTA).

The antibody also can be detectably labeled by
coupling it to a chemiluminescent compound. The presence
of the chemiluminescent-tagged antibody is then
determined by detecting the presence of luminescence that
arises during the course of a chemical reaction.
Examples of particularly useful chemiluminescent labeling
compounds are luminol, isoluminol, theromatic acridinium
ester, imidazole, acridinium salt and oxalate ester.

Likewise, a bioluminescent compound can be used to
label the antibody of the present invention.
Bioluminescence is a type of chemiluminescence found in
biological systems in which a catalytic protein
increases the efficiency of the chemiluminescent
reaction. The presence of a bioluminescent protein is
determined by detecting the presence of luminescence.
Important bioluminescent compounds for purposes of
labeling are luciferin, luciferase and aequorin.

6. EXAMPLE: IDENTIFICATION AND
CHARACTERIZATION OF A NOVEL GENE
THAT INHIBITS TUMOR PROGRESSION

In the Example presented in this Section, the in
vivo paradigm, described, above, in Section 5.1.1.1, was
utilized to identify a gene, designated herein as the 030
gene, which is differentially expressed in cells with a
high metastatic potential relative to cells having a low
metastatic potential. Specifically, the 030 gene is
expressed in high metastatic potential cells at a rate
which is many-fold lower than it is expressed in non-
metastatic cells. Thus, as discussed below, the 030 gene
can encode a product important to a number of neoplastic
processes, including, for example, the progression of a
cell to a metastatic state, the aggressiveness of a
cell's metastatic state, and the ability of a primary
tumor cell to invade surrounding tissue. Given the
differential 030 gene expression pattern revealed in this
Section, the 030 gene product can represent a protein
having tumor suppressor or inhibitor function.

6.1. MATERIALS AND METHODS

6.1.1. CELL CULTURE

B16 F1 and B16 F10 melanoma cell lines were
maintained in culture in Eagle's minimal essential medium
(MEM) supplemented with 10% fetal calf serum. Cells were
harvested from nonconfluent monolayers by a two minute
treatment with 0.25% trypsin and 2 mM EDTA.

For further characterization of in vivo activity,
each cell line was injected into mice. Cells were washed
two times in MEM, and the final cell suspension adjusted
to 5 x 10^5 cells per ml in MEM. Two hundred microliters
of this cell suspension (1 x 10^5 cells) was injected i.v.
into the lateral tail vein of C57BL/6J mice. After three
weeks, the mice were sacrificed and their lungs
autopsied. The number of pulmonary tumors was determined
by counting surface nodules using a dissecting
microscope.
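
The injected dose quoted above follows directly from the suspension density and the injection volume. The short sketch below reproduces that arithmetic; the function name is illustrative and not part of the described method.

```python
def cells_injected(cells_per_ml: float, volume_ul: float) -> float:
    """Number of cells contained in volume_ul of a suspension."""
    return cells_per_ml * volume_ul / 1000.0  # 1000 µl per ml


# Values from the text: 5 x 10^5 cells/ml, 200 µl injected per mouse.
dose = cells_injected(5e5, 200)
print(f"{dose:.0f} cells per injection")
```

With the stated values this gives 1 x 10^5 cells per mouse, in agreement with the figure in parentheses in the text.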

The differential expression of the 030 gene in B16
F1 relative to B16 F10 cell lines was compared with the
extent of pulmonary metastases which developed in B16 F1-
injected mice relative to B16 F10-injected mice.

Differential mRNA display was carried out as
described, above, in Section 5.1.1.2. Details of the
differential display are given, below.
RNA isolation

RNA was isolated, using RNAzol, from nonconfluent
monolayers of B16 F1 and B16 F10 cell lines.
Isolated RNA was resuspended in DEPC H2O and
quantitated by spectrophotometry at OD260. Approximately
half of the RNA samples were then treated with DNAse I to
remove contaminating chromosomal DNA. Each 50 µl RNA
sample (50 µg), 5.7 µl 10x PCR buffer (Perkin-
Elmer/Cetus) and 1 µl RNAse inhibitor (40 units/µl;
Boehringer Mannheim, Germany) were mixed together. Two
microliters of DNAse I (10 units/µl; Boehringer Mannheim)
was added to the reaction which was incubated for 30 min.
at 37°C. The total volume was brought to 200 µl with
DEPC H2O, extracted once with phenol/chloroform and
precipitated by adding 20 µl 3M NaOAc, pH 4.8, (DEPC-
treated), 500 µl absolute ETOH and incubated for 1 hour
on dry ice. The precipitated sample was centrifuged for
15 min., and the pellet was washed with 70% ETOH. The
sample was re-centrifuged, the remaining liquid was
aspirated, and the pellet was resuspended in 50 µl H2O.
The concentration of RNA was measured by reading the
OD260.

First strand cDNA synthesis

For each RNA sample, duplicate reverse
transcription reactions were carried out in parallel.
Four hundred ng RNA plus DEPC H2O in a total volume of 10
µl were added to 4 µl T11CC 3' primer (10 µM; Operon).
The mixture was incubated at 70°C for 5 min. to denature
the RNA and then placed at room temperature. Twenty-six
µl of reaction mix containing the following components
was added to each denatured RNA/primer sample: 8 µl 5x
First Strand Buffer (Gibco/BRL, Gaithersburg, MD), 4 µl
0.1M DTT (Gibco/BRL), 2 µl RNAse inhibitor (40 units/µl)
(Boehringer Mannheim), 4 µl 200 mM dNTP mix, 6 µl H2O, 2
µl Superscript reverse transcriptase (200 units/µl;
Gibco/BRL). The reactions were mixed gently and
incubated for 30 min. at 42°C. Sixty µl of H2O, for a
final volume of 100 µl, was then added and the samples
were denatured for 5 min. at 85°C and stored at -20°C.
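
The reverse transcription reaction is assembled in stages, so it is easy to check that the stated volumes are self-consistent. The following is a minimal bookkeeping sketch using only the volumes given in the text; the dictionary and function names are illustrative.

```python
# Components of the reaction mix, in µl, as listed in the text.
RT_MIX_UL = {
    "5x First Strand Buffer": 8,
    "0.1 M DTT": 4,
    "RNAse inhibitor": 2,
    "dNTP mix": 4,
    "H2O": 6,
    "Superscript reverse transcriptase": 2,
}

RNA_PLUS_WATER_UL = 10   # 400 ng RNA brought to 10 µl with DEPC H2O
PRIMER_UL = 4            # 3' primer
DILUTION_WATER_UL = 60   # H2O added after the 42°C incubation


def rt_volumes():
    """Return (reaction mix, assembled reaction, final) volumes in µl."""
    mix = sum(RT_MIX_UL.values())
    reaction = RNA_PLUS_WATER_UL + PRIMER_UL + mix
    final = reaction + DILUTION_WATER_UL
    return mix, reaction, final


print(rt_volumes())
```

The component volumes add up to the 26 µl of reaction mix, and the staged additions reach the stated final volume of 100 µl.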

PCR reactions

The resulting single stranded cDNA molecules were
then amplified by PCR. Specifically, 13 µl of reaction
mix was added to each tube of a 96 well plate on ice.
The reaction mix contained 6.4 µl H2O, 2 µl 10x PCR
Buffer (Perkin-Elmer), 2 µl 20 mM dNTPs, 0.4 µl 35S dATP
(12.5 µCi/µl; 50 µCi total; Dupont/NEN), 2 µl 5' primer
OPE4 (5'-GTGACATGCC-3'; 10 µM; Operon), and 0.2 µl
AmpliTaq Polymerase (5 units/µl; Perkin-Elmer). Next, 2
µl of 3' primer (T11CC, 10 µM) were added to the side of
each tube, followed by 5 µl of cDNA, also to the sides of
the tubes, which were still on ice. Tubes were capped
and mixed, and brought up to 1000 rpm in a centrifuge,
then immediately returned to ice. A Perkin-Elmer 9600
thermal cycler was used, and programmed as follows:

     94°C         2 min.
    *94°C        15 sec.      * = x 40
    *40°C         2 min.
    *ramp 72°C    1 min.
    *72°C        30 sec.
     72°C         5 min.
      4°C        hold
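
A cycling program of this kind can be written down as plain data, which makes the programmed run time easy to estimate. The sketch below transcribes the table; it assumes that the "ramp 72°C" line is a 1 minute ramp to 72°C, and it ignores instrument transition times between the other steps and the open-ended 4°C hold. The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str    # temperature or ramp target, as printed in the table
    seconds: int  # programmed duration of the step


# Differential display program transcribed from the table.
INITIAL = [Step("94C", 120)]
CYCLE = [
    Step("94C", 15),
    Step("40C", 120),
    Step("ramp to 72C", 60),  # assumption: 1 min. ramp to 72°C
    Step("72C", 30),
]
FINAL = [Step("72C", 300)]
N_CYCLES = 40


def programmed_minutes(initial, cycle, n_cycles, final):
    """Sum the programmed step times in minutes (4°C hold excluded)."""
    total = sum(s.seconds for s in initial)
    total += n_cycles * sum(s.seconds for s in cycle)
    total += sum(s.seconds for s in final)
    return total / 60


print(programmed_minutes(INITIAL, CYCLE, N_CYCLES, FINAL))  # 157.0
```

Under these assumptions the programmed steps alone come to a little over two and a half hours; the real run would be somewhat longer because of heating and cooling between steps.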



When the thermal cycler initially reached 94°C,
the 96 well plate was removed from ice and placed
directly into the cycler. Following the amplification
reaction, 15 µl of loading dye, containing 80% formamide,
10 mM EDTA, 1 mg/ml xylene cyanole, 1 mg/ml bromphenol
blue were added. The loading dye and reaction were
mixed, incubated at 85°C for 5 min., cooled on ice,
centrifuged, and placed on ice. Approximately 4 µl from
each tube was loaded onto a pre-run (60V) 6% denaturing
acrylamide gel. The gel was run at approximately 80V
until top dye front was about 1 inch from bottom. The
gel was transferred to 3MM paper (Whatman Paper, England)
and dried under vacuum. Bands were visualized by
autoradiography.
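
Because the differential display reactions are set up in a 96 well plate, the 13 µl reaction mix described earlier in this section would in practice be prepared as a bulk master mix. The sketch below checks the per-reaction volumes and scales them up; the 10% overage default is an assumption added for illustration and is not stated in the text.

```python
# Per-reaction components of the 13 µl reaction mix, in µl.
PER_REACTION_UL = {
    "H2O": 6.4,
    "10x PCR buffer": 2.0,
    "dNTPs": 2.0,
    "35S dATP": 0.4,
    "5' primer OPE4": 2.0,
    "AmpliTaq polymerase": 0.2,
}
THREE_PRIME_PRIMER_UL = 2.0  # added separately to the side of each tube
CDNA_UL = 5.0                # added last


def mix_volume_ul() -> float:
    """Volume of reaction mix dispensed per tube."""
    return round(sum(PER_REACTION_UL.values()), 2)


def final_reaction_volume_ul() -> float:
    """Volume of each tube once primer and cDNA have been added."""
    return round(mix_volume_ul() + THREE_PRIME_PRIMER_UL + CDNA_UL, 2)


def master_mix(n_reactions: int, overage: float = 0.10) -> dict:
    """Scale the per-reaction mix to n_reactions plus a pipetting overage.

    The overage fraction is a hypothetical convenience, not from the text.
    """
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


print(mix_volume_ul(), final_reaction_volume_ul())
print(master_mix(96))
```

The listed components sum to the 13 µl dispensed per tube, and each completed reaction has a volume of 20 µl.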

6.1.3. OTHER TECHNIQUES
Amplified cDNA Band Isolation and Amplification

PCR bands determined to be of interest in the
differential display analysis were recovered from the gel
and reamplified.

Briefly, differentially expressed bands were
excised from the dried gel with a razor blade and placed
into a microfuge tube with 100 µl H2O and heated at 100°C
for 5 min., vortexed, heated again to 100°C for 5 min.,
and vortexed again. After cooling, 100 µl H2O, 20 µl 3M
NaOAc, 1 µl glycogen (20 mg/ml), and 500 µl ethanol were
added and the sample was precipitated on dry ice. After
centrifugation, the pellet was washed and resuspended in
10 µl H2O.

DNA isolated from the excised differentially
expressed bands was then reamplified by PCR using the
following reaction conditions:

     58 µl     H2O
     10 µl     10x PCR Buffer (see above)
     10 µl     200 µM dNTPs
     10 µl     10 µM 3' primer (see above)
     10 µl     10 µM 5' primer (see above)
    1.5 µl     amplified band
    0.5 µl     AMPLITAQ® polymerase (5 units/µl;
               Perkin Elmer)
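
The reamplification recipe can be checked the same way, and the fold dilution of each stock in the assembled reaction follows from its share of the total volume. The sketch below uses only the listed volumes; the names are illustrative.

```python
# Reamplification reaction components, in µl.
REAMP_UL = [
    ("H2O", 58.0),
    ("10x PCR buffer", 10.0),
    ("dNTPs", 10.0),
    ("3' primer", 10.0),
    ("5' primer", 10.0),
    ("amplified band", 1.5),
    ("AmpliTaq polymerase", 0.5),
]


def total_volume_ul() -> float:
    """Total volume of the assembled reaction."""
    return sum(volume for _, volume in REAMP_UL)


def fold_dilution(component: str) -> float:
    """How many fold a stock is diluted in the final reaction."""
    volumes = dict(REAMP_UL)
    return total_volume_ul() / volumes[component]


print(total_volume_ul())
print(fold_dilution("10x PCR buffer"))
```

The reaction totals 100 µl, so the 10x buffer ends up at 1x and each 10 µl stock is diluted ten-fold.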

PCR conditions were the same as the initial
conditions used to generate the original amplified band,
as described, above. After reamplification, glycerol
loading dyes were added and samples were loaded onto a 2%
preparative TAE/Biogel (Bio101, La Jolla, CA) agarose gel
and eluted. Bands were then excised from the gel with a
razor blade and vortexed for 15 min. at r.t., and
purified using the MERMAID™ kit from Bio101 by adding 3
volumes of MERMAID™ high salt binding solution and 8 µl
of resuspended glassfog in a microfuge tube. Glassfog
was then pelleted, washed 3 times with ethanol wash
solution, and then DNA was eluted twice in 10 µl at 50°C.
Subcloning and Sequencing

The TA cloning kit (Invitrogen, San Diego, CA) was
used to subclone the amplified bands. The ligation
reaction typically consisted of 4 µl sterile H2O, 1 µl
ligation buffer, 2 µl TA cloning vector, 2 µl PCR
product, and 1 µl T4 DNA ligase. The volume of PCR
product can vary, but the total volume of PCR product
plus H2O was always 6 µl. Ligations (including vector
alone) were incubated overnight at 12°C before bacterial
transformation. TA cloning kit competent bacteria
(INVαF': endA1, recA1, hsdR17(r-k, m+k), supE44, λ-,
thi-1, gyrA, φ80lacZαΔM15Δ(lacZYA-argF), deoR+,
F') were thawed on ice and 2 µl of 0.5 M β-
mercaptoethanol were added to each tube. Two µl from
each ligation were added to each tube of competent cells
(50 µl), mixed without vortexing, and incubated on ice
for 30 min. Tubes were then placed in a 42°C bath for
exactly 30 sec., before being returned to ice for 2 min.
Four hundred-fifty µl of SOC media (Sambrook et al.,
1989, supra) were then added to each tube which were then
shaken at 37°C for 1 hr. Bacteria were then pelleted,
resuspended in approximately 200 µl SOC and plated on
Luria broth agar plates containing X-gal and 60 µg/ml
ampicillin and incubated overnight at 37°C. White
colonies were then picked and screened for inserts using
PCR.

A master mix containing 2 µl 10x PCR buffer, 1.6
µl 2.5 mM dNTP's, 0.1 µl 25 mM MgCl2, 0.2 µl M13 reverse
primer (100 ng/µl), 0.2 µl M13 forward primer (100
ng/µl), 0.1 µl AmpliTaq® (Perkin-Elmer), and 15.8 µl H2O
was made. Forty µl of the master mix were aliquoted into
tubes of a 96 well plate, and whole bacteria were added
with a pipette tip prior to PCR. The thermal cycler was
programmed for insert screening as follows:



     94°C         2 min.
    *94°C        15 sec.      * = x 35
    *47°C         2 min.
    *ramp 72°C   30 sec.
    *72°C        30 sec.
     72°C        10 min.
      4°C        hold



Reaction products were eluted on a 2% agarose gel
and compared to vector control. Colonies with vectors
containing inserts were purified by streaking onto LB/Amp
plates. Vectors were isolated from such strains and
subjected to sequence analysis, using an Applied
Biosystems Automated Sequencer (Applied Biosystems, Inc.
Seattle, WA).
Cloning of Human Gene

A human retina cDNA library obtained from Clontech
was screened using the entire mouse fomy030 cDNA (FIGS.
3A and 3B) as a probe. During this screen, one million
library phage were screened, 53 of which were found to
hybridize with the mouse fomy030 probe. The cDNA inserts
for eight of these positives were isolated, subcloned,
and sequenced.

Comparison of the murine fomy030 and human fohy030
sequences demonstrated a high degree of sequence
similarity (86% identical at the nucleotide level and
94.4% identical at the amino acid level) within the 5'
1813 base pairs of their cDNAs. However, beyond this
point the sequences diverge and share no significant
similarity. The sequence of fomy030 at the point of
divergence is GTAG, which corresponds to a consensus
splice donor site.

Three independent library isolated cDNAs, as well
as a cDNA isolated as a 3' RACE product, were found to
contain the fomy030 sequence. Thus, the most probable
explanation for the divergence of the human and murine
sequences is the existence of alternate splice forms of
the fomy030 and fohy030 transcripts. The fomy030 splice
version results in a protein product of 542 amino acids
in length, while the fohy030 splice variant is predicted
to encode a protein of 1497 amino acids in length (FIG.
5).

Another splice variant is shown in FIG. 6 (SEQ ID
NO:8), and encodes a protein of 1533 amino acids in
length (SEQ ID NO:9). The cDNA of FIG. 5 (SEQ ID NO:6)
is missing 34 nucleotides beginning after 2879 in SEQ ID
NO:8, and is missing 74 nucleotides beginning after 2926
in SEQ ID NO:8. Thus, nucleotides 2880-2892 in SEQ ID
NO:6 are identical to nucleotides 2914-2926 in SEQ ID
NO:8, and the sequences are essentially identical
starting at 2893 in SEQ ID NO:6 and 3001 in SEQ ID NO:8.
The difference in the respective amino acid sequences is
that the amino acids are identical from 1 to 844, and
then again from 850 to 1497 in SEQ ID NO:7 and from 886
to 1533 in SEQ ID NO:9.
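
The coordinate relationships stated above can be reproduced from the two gap positions alone. The sketch below maps a 1-based position in SEQ ID NO:6 to the corresponding position in SEQ ID NO:8; the function and constant names are illustrative, and only the numbers given in the text are used.

```python
# Segments present in SEQ ID NO:8 but missing from SEQ ID NO:6, given as
# (last SEQ ID NO:8 position before the gap, number of missing nucleotides).
DELETIONS = [(2879, 34), (2926, 74)]


def no6_to_no8(pos6: int) -> int:
    """Map a 1-based SEQ ID NO:6 coordinate onto SEQ ID NO:8."""
    pos8 = pos6
    for last_before_gap, length in DELETIONS:
        if pos8 > last_before_gap:
            pos8 += length
    return pos8


def amino_acid_offset() -> int:
    """Shift in residue numbering downstream of both gaps."""
    missing = sum(length for _, length in DELETIONS)
    if missing % 3 != 0:
        raise ValueError("gaps do not preserve the reading frame")
    return missing // 3


for pos in (2879, 2880, 2892, 2893):
    print(pos, "->", no6_to_no8(pos))
print("residue offset:", amino_acid_offset())
```

The two gaps total 108 nucleotides, a multiple of three, which is consistent with the 36 residue difference between the 1497 and 1533 amino acid products and with residue 850 of SEQ ID NO:7 aligning to residue 886 of SEQ ID NO:9.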

Within their common 5' sequences, fohy030 was also
found to have an additional three base pairs (GGA)
inserted after position 1394 in the mouse cDNA (at
positions 1066-1068 in FIGS. 5 and 6). These additional
three base pairs fall within the open reading frames of
both fohy030 and fomy030, and result in an additional
Glycine residue at position 356 within the open reading
frame of fohy030 relative to fomy030.
Northern Analysis 

Northern analysis was performed to confirm the
differential expression of the genes corresponding to the
amplified bands, as described below.

Twelve micrograms of total RNA sample, 1.5x RNA
loading dyes (60% formamide, 9% formaldehyde, 1.5x MOPS,
0.075% XC/BPB dyes) at a final concentration of 1x and
H2O to a final volume of 40 µl were mixed. The tubes
were heated at 65°C for 5 min. and then cooled on ice.
The RNA samples analyzed were loaded onto a denaturing 1%
agarose gel. The gel was run overnight at 32V in 1x
MOPS buffer.

A 300 ml denaturing 1% agarose gel was made as
follows. Three grams of agarose (SeaKem™ LE, FMC
BioProducts, Rockland, ME) and 60 ml of 5x MOPS buffer
(0.1M MOPS [pH 7.0], 40 mM NaOAc, 5 mM EDTA [pH 8.0]) were
added to 210 ml sterile H2O. The mixture was heated
until melted, then cooled to 50°C, at which time 5 µl
ethidium bromide (5 mg/ml) and 30 ml of 37% formaldehyde
were added to the melted gel mixture. The gel was
swirled quickly to mix, and then poured immediately.
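
The gel recipe above is given for a 300 ml gel, and the final concentrations it implies can be verified with simple dilution arithmetic. The following sketch also scales the recipe linearly to other volumes; linear scaling is an assumption made for illustration, and the names are not part of the described method.

```python
import math

# Recipe for a 300 ml denaturing gel, as given in the text.
BASE_TOTAL_ML = 300.0
BASE_RECIPE = {
    "agarose_g": 3.0,
    "mops_5x_ml": 60.0,
    "water_ml": 210.0,
    "formaldehyde_37pct_ml": 30.0,
    "ethidium_bromide_ul": 5.0,
}


def scaled_recipe(total_ml: float) -> dict:
    """Scale every component linearly to a gel of total_ml."""
    scale = total_ml / BASE_TOTAL_ML
    return {name: amount * scale for name, amount in BASE_RECIPE.items()}


def final_concentrations() -> dict:
    """Final concentrations implied by the 300 ml recipe."""
    r = BASE_RECIPE
    return {
        "agarose_pct_w_v": 100.0 * r["agarose_g"] / BASE_TOTAL_ML,
        "mops_x": 5.0 * r["mops_5x_ml"] / BASE_TOTAL_ML,
        "formaldehyde_pct": 37.0 * r["formaldehyde_37pct_ml"] / BASE_TOTAL_ML,
    }


liquids = sum(BASE_RECIPE[k] for k in ("mops_5x_ml", "water_ml", "formaldehyde_37pct_ml"))
assert math.isclose(liquids, BASE_TOTAL_ML)  # 60 + 210 + 30 ml
print(final_concentrations())
print(scaled_recipe(100))
```

The liquid components add up to 300 ml, giving a 1% agarose gel in 1x MOPS with formaldehyde at 3.7%.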

After electrophoresis, the gel was photographed
with a fluorescent ruler, then was washed three times in
DEPC H2O, for 20 minutes per wash, at room temperature,
with shaking. The RNA was then transferred from the gel
to Hybond-N membrane (Amersham), according to the
methods of Sambrook et al., 1989, supra, in 20x SSC
overnight.

The probes used to detect mRNA were typically
synthesized as follows: 2 µl amplified cDNA band (~30
ng), 7 µl H2O, and 2 µl 10x Hexanucleotide mix
(Boehringer-Mannheim) were mixed and heated to 95°C for 5
min., and then allowed to cool on ice. The volume of the
amplified band can vary, but the total volume of the band
plus H2O was always 9 µl. 3 µl dATP/dGTP/dTTP mix (1:1:1
of 0.5 mM each), 5 µl α32P dCTP 3000 Ci/mM (50 µCi total;
Amersham, Arlington Heights, IL), and 1 µl Klenow (2
units; Boehringer-Mannheim) were mixed and incubated at
37°C. After 1 hr., 30 µl TE were added and the reaction
was loaded onto a Biospin-6™ column (Biorad, Hercules,
CA), and centrifuged. A 1 µl aliquot of eluate was used
to measure incorporation in a scintillation counter with
scintillant to ensure that 10^6 cpm/µl of incorporation
was achieved.

For pre-hybridization, the blot was placed into a 
roller bottle containing 10 ml of rapid-hyb solution 
(Amersham), and placed into a 65°C incubator for at least 1 
hr. For hybridization, 1 x 10^7 cpm of the probe was then 
heated to 95°C, chilled on ice, and added to 10 ml of 
rapid-hyb solution. The prehybridization solution was 
then replaced with probe solution and incubated for 16 
hours at 65°C. The following day, the blot was washed 
once for 20 min. at room temperature in 2 x SSC/0.1% SDS 
and twice for 15 min. at 65°C in 0.1 x SSC/0.1% SDS 
before being covered in plastic wrap and put down for 
exposure. 
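
The amount of probe eluate needed for the hybridization step above follows from the measured incorporation. The sketch below is a hypothetical helper, assuming the 10^6 cpm/µl figure is treated as a minimum and that the small probe volume is ignored when computing the final activity in 10 ml.

```python
# Probe volume needed to deliver 1 x 10^7 cpm into 10 ml of hybridization
# solution. Function names are illustrative, not from the original protocol.

TARGET_CPM = 1e7          # cpm of probe added per hybridization
HYB_VOLUME_ML = 10.0      # rapid-hyb solution volume
MIN_CPM_PER_UL = 1e6      # incorporation the protocol checks for


def probe_volume_ul(measured_cpm_per_ul):
    """Microliters of eluate that contain the target cpm."""
    if measured_cpm_per_ul < MIN_CPM_PER_UL:
        raise ValueError("incorporation below 10^6 cpm/ul")
    return TARGET_CPM / measured_cpm_per_ul


def final_cpm_per_ml():
    """Approximate probe activity in the hybridization solution."""
    return TARGET_CPM / HYB_VOLUME_ML


if __name__ == "__main__":
    print(probe_volume_ul(1e6), "ul of eluate at the minimum incorporation")
    print(final_cpm_per_ml(), "cpm/ml in the hybridization solution")
```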

In situ Hybridization 

10 µm sections of formalin fixed/paraffin embedded 
benign nevi (non-metastatic growths of melanocytes) and 
malignant melanoma were post-fixed with 4% PFA/PBS for 15 
minutes. After washing with PBS, sections were digested 
with 21 µg/ml proteinase K at 37°C for 15 minutes, and 
again incubated with 4% PFA/PBS for 10 minutes. Sections 
were then washed with PBS, incubated with 0.2 N HCl for 
10 minutes, washed with PBS, incubated with 0.25% acetic 
anhydride/1 M triethanolamine for 10 minutes, washed with 
PBS, and dehydrated with 70% ethanol and 100% ethanol. 

Hybridizations were performed with 35S-radiolabeled 
(5 x 10^7 cpm/ml) cRNA probes encoding a 1.1 
kB segment of the coding region of the human cDNA (clone 
fohy030), and a 1 kB segment of the coding region of the 
human H4 histone gene in the presence of 50% formamide, 
10% dextran sulfate, 1 x Denhardt's solution, 600 mM 
NaCl, 10 mM DTT, 0.25% SDS, and 100 µg/ml tRNA for 18 
hours at 55°C. The H4 histone gene was used as a control 
to show proper transcription of RNA. 

After hybridization, slides were washed with 5 x 
SSC at 55°C, 50% formamide/2 x SSC at 55°C for 30 
minutes, 10 mM Tris-HCl (pH 7.6)/500 mM NaCl/1 mM EDTA 
(TNE) at 37°C for 10 minutes, washed in TNE at 37°C for 
10 minutes, incubated once in 2 x SSC at 50°C for 30 
minutes, twice in 0.2 x SSC at 50°C for 30 minutes, and 
dehydrated with 70% ethanol and 100% ethanol. 
Localization of mRNA transcripts was detected by dipping 
slides in Kodak NBT-2 photo-emulsion and exposing for 4 
days at 4°C. Controls for the in situ hybridization 
experiments included the use of a sense probe which 
showed no signal above background levels. 

6.2. RESULTS 

An in vitro paradigm, as described, above, in 
Section 5.1.1.1, was carried out using the melanoma cell 
lines, B16 F1 and B16 F10. The B16 F1 cell line exhibits 
a low metastatic potential, while the B16 F10 cell line 
exhibits a high metastatic potential. Thus, the two cell 
lines were grown in vitro as described in Section 6.1.1, 
RNA was isolated from these cells and differential 
display carried out as described in Section 6.1. 

The differential display analysis identified a 
band, designated romy030, which represents a cDNA derived 
from RNA produced by a gene which was expressed at a much 
higher level in the B16 F1 cells, i.e., the low 
metastatic potential cells, relative to the gene's 
expression in B16 F10 cells, i.e., high metastatic 
potential cells. The gene corresponding to the romy030 
band is referred to herein as the fomy030 or 030 gene. 

The amplified romy030 band was isolated, 
reamplified, subcloned, and sequenced, as described, 
above, in Section 6.1.3. The romy030 nucleotide sequence 
(SEQ ID NO:1) is shown in FIG. 2. 

A BLAST (Altschul, S.F. et al., 1990, J. Mol. 
Biol. 215:403-410) database search with the romy030 
nucleotide sequence revealed no sequences within the 
database which are similar to that of romy030. Thus, 
030, the gene corresponding to romy030, appears to 
represent a novel, previously unknown gene which is 
differentially expressed in cells exhibiting a low 
metastatic potential relative to those cells exhibiting a 
high metastatic potential. 

To confirm this putative differential regulation, 
amplified romy030 cDNA was used to probe Northern RNA 
blots containing RNA from B16 F1 and B16 F10 cells. FIG. 
1 shows the results of one such Northern blot analysis, 
in which it is demonstrated that the steady state levels 
of fomy030 mRNA are significantly higher in the low 
metastatic potential cells (i.e., the B16 F1 cells) 
relative to the high metastatic potential cells (i.e., 
B16 F10 cells). Lanes 1 and 3 represent F1 cells and 
Lanes 2 and 4 represent F10 cells, respectively. Thus, 
this Northern analysis confirmed the putative 
differential fomy030 regulation which had been suggested 
by the differential display results. 

Two specific oligonucleotides were generated based 
on the sequence of romy030, romy030U 
5'-GGGGAAGCACATCAAGGAAC-3' (SEQ ID NO:4) and romy030L 
5'-GCAACTACACTCGGAAAAGC-3' (SEQ ID NO:5), for use in PCR 
reactions. cDNA libraries prepared from mRNA isolated 
from normal melanocytes and a mouse melanoma cell line 
were analyzed for the presence of fomy030 by PCR, 
utilizing the above romy030 probes. Fomy030 was detected 
in the melanocyte library but not in the melanoma 
library. The melanoma library was generated from a 
highly metastatic mouse melanoma K-1735 m2. This result 
is consistent with the observation that fomy030 is present at 
reduced levels in the metastatic B16 F10 melanoma cell 
line. A radioactive DNA probe was generated from the 
subcloned romy030 DNA. This probe was used to screen the 
normal mouse melanocyte cDNA library. Three independent 
positive clones were identified and isolated during this 
screening. These clones were designated fomy030a, 
fomy030b, and fomy030c. These cDNAs were sequenced and 
the overlapping portions were found to be identical. The 
nucleotide sequence of all three fomy030 cDNAs, 
designated as the fomy030 sequence (SEQ ID NO:2), is 
depicted in FIGS. 3A and 3B, and contains the sequence of 
romy030. The findings described herein suggest a novel 
role for fomy030 in tumor progression. A down-regulation 
of 030 can be used as a diagnostic marker for tumor 
progression, especially for the progression to 
metastasis. Further, 030 gene products can be used in 
the prevention and treatment of tumor progression 
disorders. 
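
The two oligonucleotides described in this paragraph can be located within the romy030 sequence of SEQ ID NO:1 with a few lines of code. This is a minimal sketch and not part of the original methods: the helper names are hypothetical, and it uses the 20-nucleotide primer sequences exactly as written in the text above.

```python
# Locate the romy030U / romy030L oligonucleotides within the romy030
# sequence (SEQ ID NO:1) and report the span they bracket.
# Helper names are illustrative only.

ROMY030 = "".join("""
GGTGCTGGAG TACCTCATGG GCGGTGCCTA CCGCTGCAAC TACACTCGGA AAAGCTTCCG
GACTCTCTAC AACAACTTCT TTGGCCCTAA GACGGTAGAG CTCAGCAGAC ACACAGTCTC
CTGTGCCTCC CAGAGTAACA TGTGGTTCCT TGATGTGCTT CCCCAAAAGC CCACCTCTGC
AGAATG
""".split())

ROMY030U = "GGGGAAGCACATCAAGGAAC"
ROMY030L = "GCAACTACACTCGGAAAAGC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_sites(template, sense_primer, antisense_primer):
    """Return (start, end) 0-based half-open span bracketed by the primers."""
    start = template.find(sense_primer)
    rc = reverse_complement(antisense_primer)
    rc_start = template.find(rc)
    if start < 0 or rc_start < 0:
        raise ValueError("primer not found on template")
    return start, rc_start + len(rc)


if __name__ == "__main__":
    start, end = primer_sites(ROMY030, ROMY030L, ROMY030U)
    print(f"template length: {len(ROMY030)}")
    print(f"bracketed span: {start + 1}..{end} ({end - start} bases)")
```

In this sketch romy030L matches the listed strand directly, while romy030U matches as its reverse complement, so the pair is oriented to amplify the region between them.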

Fohy030 Expression in Human Tissue Samples 

To determine whether the fohy030 gene product is 
differentially expressed in clinically relevant human 
disease, fohy030 gene expression was analyzed in biopsy 
sections of human benign nevi (non-metastatic growths of 
melanocytes) and malignant melanoma using in situ 
hybridization. Fohy030 expression was detected in small 
intermittent cells in the basal layer of the epidermis 
(likely, melanocytes) and in the majority of nevus cells 
in patients diagnosed with benign nevi. No expression of 
fohy030 was detected in the majority of melanoma cells in 
patients diagnosed with metastatic melanoma, though 
expression was detected in normal melanocytic cells in 
the same tissue section. These results show that the 
fohy030 gene product is associated with metastasis 
suppression. 

6.3. 030 GENE EXPRESSION IS INVERSELY 
CORRELATED WITH METASTATIC POTENTIAL 

6.3.1. EXPERIMENTAL PROTOCOLS AND RESULTS 

The relationship between 030 gene expression and 
tumor progression was confirmed as described herein. 
Specifically, the metastatic potentials of six variants 
of the B16 cell line were tested in animals and the 
metastatic potential was compared to the level of 030 
gene expression observed within the cell variants. 

A single cell suspension of B16 F1 cells (low 
metastatic potential) was injected intravenously into 
syngeneic C57BL/6 mice. After three weeks, lung tumors 
were excised and seeded into tissue culture. The 
following six cell lines were grown in culture: B16 G1, 
B16 G2, B16 G3, B16 G4, B16 G9 and B16 G12. 

To test the metastatic ability of the above listed 
six tumor cell lines, the same number of cells for each 
of the six cell lines was injected intravenously into 
different groups of syngeneic C57BL/6 mice. Three weeks 
later, the mice were killed and the lungs were removed 
aseptically. Significantly more tumors were observed in mice 
injected with the following three cell lines: B16 G4, 
B16 G9 and B16 G12. These results demonstrate that the 
B16 G4, B16 G9 and B16 G12 cell lines have high 
metastatic potential and the B16 G1, B16 G2 and B16 G3 
cell lines have low metastatic potential. 

The lung tumors produced from these three highly 
metastatic cell lines (B16 G4, B16 G9 and B16 G12) were 
then excised and seeded into tissue culture to produce 
the following four cell lines: B16 H5, B16 H6, B16 H7 and 
B16 H8. 

Northern analysis was performed to determine the 
expression of the 030 gene in the above listed cell lines 
(i.e., B16 H5, B16 H6, B16 H7 and B16 H8) using 
procedures described above in Section 6.1.3. Figure 4 
shows the results of one such Northern blot analysis, in 
which it is demonstrated that the steady state levels of 
030 mRNA are significantly lower in the highly metastatic 
cells (i.e., B16 H5, B16 H6, B16 H7 and B16 H8) relative 
to the B16 F1 low metastatic potential cells. Lane 1 
represents the B16 F1 cells, lane 2 is B16 F10 metastatic 
cells and lanes 3-6 represent B16 H5, B16 H6, B16 H7 and 
B16 H8. 

Thus, this Northern analysis confirmed the initial 
finding in this invention that 030 expression is 
inversely related to the metastatic potential of tumor 
cells and supports the theory that the 030 gene product 
plays a role in inhibiting tumor progression, including 
the progression to a high metastatic potential state. In 
this regard, it is important to note that the tumor cell 
number and homogeneity, and the syngeneic recipient did 
not change from one cell line to another in the above 
protocols. Therefore, the differences in metastatic 
incidence can only be attributed to properties intrinsic 
to the various cell lines used. The clonal selection of 
tumors from successive metastases results in cells better 
capable of survival, formation and progression of tumor 
foci in the lung. This indicates that the decrease in 
expression of 030 observed in the four highly metastatic 
cell lines (i.e., B16 H5, B16 H6, B16 H7 and B16 H8) is 
an intrinsic property of these cell lines and is related 
to the development, progression and metastatic potential 
of the tumor cells. 

7. EXAMPLE: USE OF FINGERPRINT GENES 
AS SURROGATE MARKERS IN CLINICAL TRIALS 

The expression pattern of the fingerprint genes of 
the invention may be utilized as surrogate markers to 
monitor clinical human trials of drugs being tested for 
their efficacy as tumor progression treatments, or may, 
additionally, be used to monitor patients undergoing 
clinical evaluation for the treatment of tumor 
progression. "Fingerprint gene," as used herein, is 
defined as in Section 3, above. Individual fingerprint 
gene expression patterns may be analyzed or, 
alternatively, fingerprint patterns may be analyzed. 
"Fingerprint pattern," as used herein, is defined as in 
Section 3, above. 

The effect of the compound on the fingerprint gene 
expression normally displayed in connection with a 
disorder involving tumor progression can be used to 
evaluate the efficacy of the compound as a treatment for 
such a disorder. Additionally, fingerprint gene 
expression can be used to monitor patients undergoing 
clinical evaluation for the treatment of the disorder. 

According to the invention, the fingerprint gene 
expression and fingerprint pattern derived from any of 
the paradigms described in Section 5.1.1.1 can be used to 
monitor clinical trials of drugs in human patients. The 
paradigms described in Section 5.1.1.1, and illustrated 
in the Example presented in Section 6, above, for 
example, provide the fingerprint pattern of B16 melanoma 
cells. This profile gives an indicative reading, 
therefore, of the metastatic and non-metastatic states of 
melanoma cells. Accordingly, the influence of anticancer 
chemotherapeutic agents on the melanoma cells can be 
measured by performing differential display on melanoma 
cells of patients undergoing clinical tests. 

7.1. TREATMENT OF PATIENTS AND PROCUREMENT 
OF TUMOR CELLS OR BIOPSIES 

Test patients can be administered compounds 
suspected of antimetastatic activity. Control patients 
can be given a placebo. 

Tumor cells or biopsies can be drawn from each 
patient after a determined period of treatment and RNA 
can be isolated as described in Section 6.6.1, above. 

7.2. ANALYSIS OF SAMPLES 

RNA can be subjected to differential display 
analysis as described in Section 6.6.1, above. A 
decrease in the metastatic potential of tumor cells is 
indicated by an increase in the intensity of the romy030 
band, as described in Section 6.2, above. 
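
One way such a band-intensity comparison might be scored is sketched below. This is a hypothetical illustration, not a method stated in the text: normalizing to a loading-control band, the function names, and the 1.5-fold threshold are all assumptions introduced here.

```python
# Hypothetical scoring of romy030 band intensity before and after treatment.
# Normalization to a loading control and the threshold are assumptions.


def normalized_intensity(band, loading_control):
    """Band intensity relative to a loading-control band in the same lane."""
    if loading_control <= 0:
        raise ValueError("loading control intensity must be positive")
    return band / loading_control


def romy030_fold_change(baseline, treated):
    """Fold change of normalized romy030 intensity.

    Each argument is a (romy030_band, loading_control) pair of
    densitometry readings in arbitrary units.
    """
    before = normalized_intensity(*baseline)
    if before <= 0:
        raise ValueError("baseline romy030 intensity must be positive")
    return normalized_intensity(*treated) / before


def suggests_reduced_metastatic_potential(fold_change, threshold=1.5):
    """True when the band has increased by at least the chosen threshold."""
    return fold_change >= threshold


if __name__ == "__main__":
    # Example readings are made up purely to exercise the functions.
    fold = romy030_fold_change(baseline=(100.0, 200.0), treated=(300.0, 200.0))
    print(f"fold change: {fold:.2f}")
    print("increase detected:", suggests_reduced_metastatic_potential(fold))
```

An increase in the normalized romy030 band would be read, per the text above, as indicating a decrease in metastatic potential.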

8. DEPOSIT OF MICROORGANISMS 

The following microorganism was deposited with the 
Agricultural Research Service Culture Collection (NRRL), 
Peoria, Illinois, on March 3, 1995 and assigned the 
indicated accession number: 

Microorganism        NRRL Accession No. 

E. coli              B-21416 

OTHER EMBODIMENTS 

The present invention is not to be limited in 
scope by the specific embodiments described, which are 
intended as single illustrations of individual aspects of 
the invention, and functionally equivalent methods and 
components are within the scope of the invention. 
Indeed, various modifications of the invention, in 
addition to those shown and described herein, will become 
apparent to those skilled in the art from the foregoing 
description and accompanying drawings. Such 
modifications are intended to fall within the scope of 
the appended claims. 

SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

    (i) APPLICANT: Shyjan, Andrew W. 

    (ii) TITLE OF INVENTION: COMPOSITIONS FOR THE DIAGNOSIS, PREVENTION 
         AND TREATMENT OF TUMOR PROGRESSION 

    (iii) NUMBER OF SEQUENCES: 9 

    (iv) CORRESPONDENCE ADDRESS: 
         (A) ADDRESSEE: Fish & Richardson P.C. 
         (B) STREET: 225 Franklin Street 
         (C) CITY: Boston 
         (D) STATE: MA 
         (E) COUNTRY: USA 
         (F) ZIP: 02110-2804 

    (v) COMPUTER READABLE FORM: 
         (A) MEDIUM TYPE: Floppy disk 
         (B) COMPUTER: IBM PC compatible 
         (C) OPERATING SYSTEM: PC-DOS/MS-DOS 
         (D) SOFTWARE: PatentIn Release #1.0, Version #1.30 

    (vi) CURRENT APPLICATION DATA: 
         (A) APPLICATION NUMBER: US 
         (B) FILING DATE: 29-MAR-1996 
         (C) CLASSIFICATION: 

    (vii) PRIOR APPLICATION DATA: 
         (A) APPLICATION NUMBER: US 08/412,431 
         (B) FILING DATE: 29-MAR-1995 

    (viii) ATTORNEY/AGENT INFORMATION: 
         (A) NAME: Fasse, J. Peter 
         (B) REGISTRATION NUMBER: 32,983 
         (C) REFERENCE/DOCKET NUMBER: 07334/004001 

    (ix) TELECOMMUNICATION INFORMATION: 
         (A) TELEPHONE: 617/542-5070 
         (B) TELEFAX: 617/542-8906 
         (C) TELEX: 200154 

(2) INFORMATION FOR SEQ ID NO: 1: 

    (i) SEQUENCE CHARACTERISTICS: 
         (A) LENGTH: 186 base pairs 
         (B) TYPE: nucleic acid 
         (C) STRANDEDNESS: single 
         (D) TOPOLOGY: linear 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

GGTGCTGGAG TACCTCATGG GCGGTGCCTA CCGCTGCAAC TACACTCGGA AAAGCTTCCG 60 

GACTCTCTAC AACAACTTCT TTGGCCCTAA GACGGTAGAG CTCAGCAGAC ACACAGTCTC 120 

CTGTGCCTCC CAGAGTAACA TGTGGTTCCT TGATGTGCTT CCCCAAAAGC CCACCTCTGC 180 

AGAATG 186 

(2) INFORMATION FOR SEQ ID NO: 2: 

    (i) SEQUENCE CHARACTERISTICS: 
         (A) LENGTH: 2729 base pairs 
         (B) TYPE: nucleic acid 
         (C) STRANDEDNESS: single 
         (D) TOPOLOGY: linear 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

AAGGAGGCTA GGCTGCACCC TTCCCGCTTG CTCAGCAGCT GAGGCAGGGT CAGAAAGCAT 60 

GGATAGAGAA CACATTTTGC AAAAGGCAAT GCATCTTTGT AATTCCCAGT ACAAAAGACC 120 

CTAACAGATG TTGCTGTGGT CAGCTCACTA ACCAGCACAT CCCCCCTTTG CCGAGTGGGG 180 

CTCCCAGCAC AACAGGAGAG GACACCAAGC AGGCAGACAC GCAGTCCGGG AAATGGTCTG 240 

TCAGCAAACA CACCCAGAGC TACCCAACAG ACTCCTATGG GATTCTTGAA TTCCAGGGTG 300 

GGGGTTACTC CAATAAAGCC ATG TAC ATC CGA GTC TCC TAC GAC ACC AAG 350 
                      Met Tyr Ile Arg Val Ser Tyr Asp Thr Lys 
                      1 5 10 

CCA GAT TCC CTG CTC CAC CTC ATG GTG AAG GAC TGG CAG CTG GAG CTC 398 
Pro Asp Ser Leu Leu His Leu Met Val Lys Asp Trp Gln Leu Glu Leu 
15 20 25 

CCG AAG CTC TTG ATA TCT GTG CAC GGA GGC CTC CAA AGC TTC GAG ATG 446 
Pro Lys Leu Leu Ile Ser Val His Gly Gly Leu Gln Ser Phe Glu Met 
30 35 40 

CAG TCC AAA CTG AAG CAG GTG TTT GGG AAA GGT CTG ATC AAG GCT GCC 494 
Gln Ser Lys Leu Lys Gln Val Phe Gly Lys Gly Leu Ile Lys Ala Ala 
45 50 55 

ATG ACC ACG GGG GCG TGG ATC TTC ACC GGG GCT GTG AGC ACT GGT GTC 542 
Met Thr Thr Gly Ala Trp Ile Phe Thr Gly Gly Val Ser Thr Gly Val 
60 65 70 

GTC AGC CAT GTG GGG GAT GCC TTG AAA GAC CAC TCC TCC AAG TCC AGA 590 
Val Ser His Val Gly Asp Ala Leu Lys Asp His Ser Ser Lys Ser Arg 
75 80 85 90 

GGC CGG CTC TGT GCT ATA GGA ATT GCT CCC TGG GGC ATG GTG GAG AAC 638 
Gly Arg Leu Cys Ala Ile Gly Ile Ala Pro Trp Gly Met Val Glu Asn 
95 100 105 

AAG GAA GAC CTG ATT GGA AAA GAT GTA AGA AGA GTC TAT CAG ACC ATG 686 
Lys Glu Asp Leu Ile Gly Lys Asp Val Thr Arg Val Tyr Gln Thr Met 
110 115 120 

TCC AAC CCT CTG AGC AAG CTC TCT GTG CTC AAC AAT TCC CAC ACT CAC 734 
Ser Asn Pro Leu Ser Lys Leu Ser Val Leu Asn Asn Ser His Thr His 
125 130 135 

TTC ATC TTG GCT GAC AAC GGC ACC CTG GGC AAG TAT GGT GCT GAG GTG 782 
Phe Ile Leu Ala Asp Asn Gly Thr Leu Gly Lys Tyr Gly Ala Glu Val 
140 145 150 

AAG CTT CGA AGA CAG CTG GAA AAA CAC ATC TCC CTG CAG AAG ATC AAC 830 
Lys Leu Arg Arg Gln Leu Glu Lys His Ile Ser Leu Gln Lys Ile Asn 
155 160 165 170 

ACA AGG CTG GGC CAG GGT GTA CCT GTC GTG GGC CTA GTG GTA GAA GGT 878 
Thr Arg Leu Gly Gln Gly Val Pro Val Val Gly Leu Val Val Glu Gly 
175 180 185 

GGT CCT AAC GTG GTT TCT ATC GTC CTG GAG TAT CTC AAA GAA GAC CCT 926 
Gly Pro Asn Val Val Ser Ile Val Leu Glu Tyr Leu Lys Glu Asp Pro 
190 195 200 

CCT GTC CCT GTG GTG GTT TGC GAT GGC ACT GGA CGT GCC TCT GAC ATT 974 
Pro Val Pro Val Val Val Cys Asp Gly Ser Gly Arg Ala Ser Asp Ile 
205 210 215 

TTG TCC TTC GGA CAC AAA TAC TGC GAC GAA GGA GGA GTC ATA AAC GAG 1022 
Leu Ser Phe Ala His Lys Tyr Cys Asp Glu Gly Gly Val Ile Asn Glu 
220 225 230 

TCC CTG CGG GAC CAG CTT CTA GTT ACC ATT CAG AAA ACA TTT AAT TAC 1070 
Ser Leu Arg Asp Gln Leu Leu Val Thr Ile Gln Lys Thr Phe Asn Tyr 
235 240 245 250 

AGC AAG TCC CAG TCG TAT CAG CTG TTT GCA ATT ATC ATG GAG TGC ATG 1118 
Ser Lys Ser Gln Ser Tyr Gln Leu Phe Ala Ile Ile Met Glu Cys Met 
255 260 265 

AAG AAG AAA GAA CTC GTC ACT GTG TTT CGG ATG GGT TCC GAG GGT CAG 1166 
Lys Lys Lys Glu Leu Val Thr Val Phe Arg Met Gly Ser Glu Gly Gln 
270 275 280 

CAA GAT GTC GAG ATG GCA ATT TTA ACT GCC TTC CTC AAA GGA ACC AAC 1214 
Gln Asp Val Glu Met Ala Ile Leu Thr Ala Leu Leu Lys Gly Thr Asn 
285 290 295 

GCA TCA GCT CCA GAT CAG CTG AGC TTG GCC CTG GCT TGG AAC CGG GTC 1262 
Ala Ser Ala Pro Asp Gln Leu Ser Leu Ala Leu Ala Trp Asn Arg Val 
300 305 310 

GAC ATA GCG CGA AGC CAG ATC TTC GTC TTT GGC CCA CAC TGG CCG CCA 1310 
Asp Ile Ala Arg Ser Gln Ile Phe Val Phe Gly Pro His Trp Pro Pro 
315 320 325 330 

CTG GGA AGC CTG GCC CCT CCT GTG GAC ACC AAA GCC GCA GAG AAG GAA 1358 
Leu Gly Ser Leu Ala Pro Pro Val Asp Thr Lys Ala Ala Glu Lys Glu 
335 340 345 

AAG AAG CCA CCC ACA GCC ACC ACC AAG GGG AGA GGA AAA GGA AAA GGC 1406 
Lys Lys Pro Pro Thr Ala Thr Thr Lys Gly Arg Gly Lys Gly Lys Gly 
350 355 360 

AAG AAG AAA GGC AAA GTG AAA GAG CAA GTG GAG GAA GAG ACG GAC CCC 1454 
Lys Lys Lys Gly Lys Val Lys Glu Glu Val Glu Glu Glu Thr Asp Pro 
365 370 375 

CGG AAG CTT GAG CTG CTC AAC TGG GTG AAT GCC CTG GAG CAA GCC ATG 1502 
Arg Lys Leu Glu Leu Leu Asn Trp Val Asn Ala Leu Glu Gln Ala Met 
380 385 390 

CTG CAT GCT CTT GTC CTA GAT CGG GTG GAC TTT GTA AAG CTC CTG ATT 1550 
Leu Asp Ala Leu Val Leu Asp Arg Val Asp Phe Val Lys Leu Leu Ile 
395 400 405 410 

GAA AAC GGA GTG AAC ATG CAG CAT TTC CTC ACC ATC CCG AGG CTG GAG 1598 
Glu Asn Gly Val Asn Met Gln His Phe Leu Thr Ile Pro Arg Leu Glu 
415 420 425 

GAG CTA TAC AAC ACC AGA CTG GGC CCA CCA AAC ACC CTT CAT CTG CTG 1646 
Glu Leu Tyr Asn Thr Arg Leu Gly Pro Pro Asn Thr Leu His Leu Leu 
430 435 440 

GTG COG GAT GTA AAG AAG AGC AAC CTT CCA CCT GAT TAC CAC ATC ACC 1694 
Val Arg Asp Val Lys Lys Ser Asn Leu Pro Pro Asp Tyr His Ile Ser 
445 450 455 

CTC ATT GAT ATA GGA CTG CTG CTG GAG TAC CTC ATG GGC GGT GCC TAC 1742 
Leu Ile Asp Ile Gly Leu Val Leu Glu Tyr Leu Met Gly Gly Ala Tyr 
460 465 470 

CGC TGC AAC TAC ACT CGG AAA AGC TTC CGG ACT CTC TAC AAC AAC TTG 1790 
Arg Cys Asn Tyr Thr Arg Lys Ser Phe Arg Thr Leu Tyr Asn Asn Leu 
475 480 485 490 

TTT GGC CCT AAG AGG GTA GAG CTC AGC AGA CAC ACA GTG TCC TGT GCC 1838 
Phe Gly Pro Lys Arg Val Glu Leu Ser Arg His Thr Val Ser Cys Ala 
495 500 505 

TCC CAG AGT AAC ATG TGG TTC CTT GAT GTG CTT CCC CAA AAG CCC ACC 1886 
Ser Gln Ser Asn Met Trp Phe Leu Asp Val Leu Pro Gln Lys Pro Thr 
510 515 520 

TGT GCA GAA TGC AAC TCT TCA CCT CAC CTG TCC CAA ACT GAC ATC ACC 1934 
Cys Ala Glu Cys Asn Ser Ser Pro His Leu Ser Gln Thr Asp Ile Thr 
525 530 535 

CCA CCT CTG CCC T GACACCCACT GCAGGGCCTC CTAGCTTTCA CATGCAGCCA 1987 
Pro Pro Leu Pro 
540 

TTCACATCGC CTCTCAAGAC TGGGCCAGGC AGTGCAACCT GTCAAGCATG TCTGTCCTCC 2047 

CCTCCTTCCT ACAATAGCCC CCCCTCTGGG CCCCATGCCT CTGCTCTCTC AGCCCGTTCT 2107 

CCTCCCCACT GATCACTGGC CCTCCTGTTG TCTTCCAAGG CAAGGAACAA GGAAAAGCAT 2167 

CTTTTTGCCC ACAAAAGTTT AGGGCTCCCC GCTGTTCAAC CATAGCCAAC CTCACTGTAC 2227 

ATCGGAGTCA TCCAGGCCAG CTGCCACACA CAAGCCTTCC CCACCCTATC CCAATAGACC 2287 

CTATTCCTCC ATCAAAATCA AAGCTAACTC CTGGCCTGCC ACATTGCTTC TTCTTGCTCC 2347 

AGCCTGTTAA ACCTCCAATA AATGTCAGAT CTGTGGGAAG CCTTCCTCAC TCTCACTCCA 2407 

CAGTTTGTAC AGAGAGCGAG AGCCTCGTTT GGTTCTACTT ACAAGGAAGG CTTTGTGTCT 2467 

GTCTGTCCTT CCCAACTGAC TTCTGTTGAC AGAAGCAGTT TCCACATGAA AGCGTTGACT 2527 

CACCTGGATG TTGTCATTAA TTAATAGTGA TACAAAATAT TGACACTTCT TTTCCTGCTT 2587 

CTTTGTTATG CAGCCGAAAG CACTTAAGCT TCTGGGAATG GAAGTAAGTA GGACATGTTT 2647 

GTGGCAGTTT ATTTACTATA TATACCTTTG TCATTCTGTG GAAGCAAAAA TTGCAATGTT 2707 

TTCCATGAAT AAAGCTCGTG CC 2729 

(2) INFORMATION FOR SEQ ID NO: 3: 

    (i) SEQUENCE CHARACTERISTICS: 
         (A) LENGTH: 542 amino acids 
         (B) TYPE: amino acid 
         (D) TOPOLOGY: linear 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Tyr Ile Arg Val Ser Tyr Asp Thr Lys Pro Asp Ser Leu Leu His 
1 5 10 15 

Leu Met Val Lys Asp Trp Gln Leu Glu Leu Pro Lys Leu Leu Ile Ser 
20 25 30 

Val His Gly Gly Leu Gln Ser Phe Glu Met Gln Ser Lys Leu Lys Gln 
35 40 45 

Val Phe Gly Lys Gly Leu Ile Lys Ala Ala Met Thr Thr Gly Ala Trp 
50 55 60 

Ile Phe Thr Gly Gly Val Ser Thr Gly Val Val Ser His Val Gly Asp 
65 70 75 80 

Ala Leu Lys Asp His Ser Ser Lys Ser Arg Gly Arg Leu Cys Ala Ile 
85 90 95 

Gly Ile Ala Pro Trp Gly Met Val Glu Asn Lys Glu Asp Leu Ile Gly 
100 105 110 

Lys Asp Val Thr Arg Val Tyr Gln Thr Met Ser Asn Pro Leu Ser Lys 
115 120 125 

Leu Ser Val Leu Asn Asn Ser His Thr His Phe Ile Leu Ala Asp Asn 
130 135 140 

Gly Thr Leu Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Gln Leu 
145 150 155 160 

Glu Lys His Ile Ser Leu Gln Lys Ile Asn Thr Arg Leu Gly Gln Gly 
165 170 175 

Val Pro Val Val Gly Leu Val Val Glu Gly Gly Pro Asn Val Val Ser 
180 185 190 

Ile Val Leu Glu Tyr Leu Lys Glu Asp Pro Pro Val Pro Val Val Val 
195 200 205 

Cys Asp Gly Ser Gly Arg Ala Ser Asp Ile Leu Ser Phe Ala His Lys 
210 215 220 

Tyr Cys Asp Glu Gly Gly Val Ile Asn Glu Ser Leu Arg Asp Gln Leu 
225 230 235 240 

Leu Val Thr Ile Gln Lys Thr Phe Asn Tyr Ser Lys Ser Gln Ser Tyr 
245 250 255 

Gln Leu Phe Ala Ile Ile Met Glu Cys Met Lys Lys Lys Glu Leu Val 
260 265 270 

Thr Val Phe Arg Met Gly Ser Glu Gly Gln Gln Asp Val Glu Met Ala 
275 280 285 

Ile Leu Thr Ala Leu Leu Lys Gly Thr Asn Ala Ser Ala Pro Asp Gln 
290 295 300 

Leu Ser Leu Ala Leu Ala Trp Asn Arg Val Asp Ile Ala Arg Ser Gln 
305 310 315 320 

Ile Phe Val Phe Gly Pro His Trp Pro Pro Leu Gly Ser Leu Ala Pro 
325 330 335 

Pro Val Asp Thr Lys Ala Ala Glu Lys Glu Lys Lys Pro Pro Thr Ala 
340 345 350 

Thr Thr Lys Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Gly Lys Val 
355 360 365 

Lys Glu Glu Val Glu Glu Glu Thr Asp Pro Arg Lys Leu Glu Leu Leu 
370 375 380 

Asn Trp Val Asn Ala Leu Glu Gln Ala Met Leu Asp Ala Leu Val Leu 
385 390 395 400 

Asp Arg Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val Asn Met 
405 410 415 

Gln His Phe Leu Thr Ile Pro Arg Leu Glu Glu Leu Tyr Asn Thr Arg 
420 425 430 

Leu Gly Pro Pro Asn Thr Leu His Leu Leu Val Arg Asp Val Lys Lys 
435 440 445 

Ser Asn Leu Pro Pro Asp Tyr His Ile Ser Leu Ile Asp Ile Gly Leu 
450 455 460 

Val Leu Glu Tyr Leu Met Gly Gly Ala Tyr Arg Cys Asn Tyr Thr Arg 
465 470 475 480 

Lys Ser Phe Arg Thr Leu Tyr Asn Asn Leu Phe Gly Pro Lys Arg Val 
485 490 495 

Glu Leu Ser Arg His Thr Val Ser Cys Ala Ser Gln Ser Asn Met Trp 
500 505 510 

Phe Leu Asp Val Leu Pro Gln Lys Pro Thr Cys Ala Glu Cys Asn Ser 
515 520 525 

Ser Pro His Leu Ser Gln Thr Asp Ile Thr Pro Pro Leu Pro 
530 535 540 

(2) INFORMATION FOR SEQ ID NO: 4: 

    (i) SEQUENCE CHARACTERISTICS: 
         (A) LENGTH: 20 base pairs 
         (B) TYPE: nucleic acid 
         (C) STRANDEDNESS: single 
         (D) TOPOLOGY: linear 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

GGGGAAGCAC ATCAAGGAAC 20 

(2) INFORMATION FOR SEQ ID NO: 5: 

    (i) SEQUENCE CHARACTERISTICS: 
         (A) LENGTH: 23 base pairs 
         (B) TYPE: nucleic acid 
         (C) STRANDEDNESS: single 
         (D) TOPOLOGY: linear 

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

GCAACTACTA CACTCGGAAA AGC 23 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4944 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE: 

(A) NAME /KEY: CDS 

(B) LOCATION: 346.. 4837 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

ACTCATTATA GGGNTCGAGC GGCCGCCCGG GCAGGTTTGA GCTGTGCCCT CTCCATTCCA 60 

CTGCTGTGGC AGGGTCAGAA ATCTTGGATA GAGAAAACCT TTTGCAAACG GGAATGTATC 120 

TTTGTAATTC CTAGCACGAA AGACTCTAAC AGGTGTTGCT GTGGCCAGTT CACCAACCAG 180 

CATATCCCCC CTCTGCCAAG TGCAACACCC AGCAAAAATG AAGAGGAAAG CAAACAGGTG 240 

GAGACTCAGC CTGAGAAATG GTCTGTTGCC AAGCACACCC AGAGCTACCC AACAGATTCC 300 

TATGGAGTTC TTGAATTCCA GGGTGGCGGA TATTCCAATA AAGCC ATG TAT ATC 354 

Met Tyr Ile 

1 

CGT GTA TCC TAT GAC ACC AAG CCA GAC TCA CTG CTC CAT CTC ATG GTO 402 
Arg Val Ser Tyr Asp Thr Lys Pro Asp Ser Leu Leu His Leu Met Val 
5 10 15 

AAA GAT TGG CAG CTG GAA CTC CCC AAG CTC TTA ATA TCT GTG CAT GGA 450 
Lys Asp Trp Gln Leu Glu Leu Pro Lys Leu Leu Ile Ser Val His Gly 
20 25 30 35 

GGC CTC CAG AAC TTT GAG ATG CAG CCC AAG CTG AAA CAA GTC TTT GGG 498 
Gly Leu Gln Asn Phe Glu Met Gln Pro Lys Leu Lys Gln Val Phe Gly 
40 45 50 

AAA GGC CTG ATC AAG GCT GCT ATG ACC ACC GGG GCC TGG ATC TTC ACC 546 
Lys Gly Leu Ile Lys Ala Ala Met Thr Thr Gly Ala Trp Ile Phe Thr 
55 60 65 

GGG GGT GTC AGC ACA GGT GTT ATC AGC CAC GTA GGG GAT GCC TTG AAA 594 
Gly Gly Val Ser Thr Gly Val Ile Ser His Val Gly Asp Ala Leu Lys 
70 75 80 

GAC CAC TCC TCC AAG TCC AGA GGC CGG GTT TGT GCT ATA GGA ATT GCT 642 
Asp His Ser Ser Lys Ser Arg Gly Arg Val Cys Ala Ile Gly Ile Ala 
85 90 95 
CCA TGG GCC ATC GTG GAG AAT AAG GAA GAC CTG GTT GGA AAG GAT GTA 690 
Pro Trp Gly Ile Val Glu Asn Lys Glu Asp Leu Val Gly Lys Asp Val 
100 105 110 115 

ACA AGA GTG TAC CAG ACC ATG TCC AAC CCT CTA AGT AAG CTC TCT GTG 738 
Thr Arg Val Tyr Gln Thr Met Ser Asn Pro Leu Ser Lys Leu Ser Val 
120 125 130 

CTC AAC AAC TCC CAC ACC CAC TTC ATC CTG GCT GAC AAT GGC ACC CTG 786 
Leu Asn Asn Ser His Thr His Phe Ile Leu Ala Asp Asn Gly Thr Leu 
135 140 145 

GGC AAG TAT GGC GCC GAG GTG AAG CTG CGA AGG CTG CTG GAA AAG CAC 834 
Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Leu Leu Glu Lys His 
150 155 160 

ATC TCC CTC CAG AAG ATC AAC ACA AGA CTG GGG CAG GGC GTG CCC CTC 882 
Ile Ser Leu Gln Lys Ile Asn Thr Arg Leu Gly Gln Gly Val Pro Leu 
165 170 175 

GTG GGT CTC GTG GTG GAG GGG GGC CCT AAC GTG GTG TCC ATC GTC TTG 930 
Val Gly Leu Val Val Glu Gly Gly Pro Asn Val Val Ser Ile Val Leu 
180 185 190 195 

GAA TAC CTG CAA GAA GAG CCT CCC ATC CCT GTG GTG ATT TCT GAT GGC 978 
Glu Tyr Leu Gln Glu Glu Pro Pro Ile Pro Val Val Ile Cys Asp Gly 
200 205 210 

AGC GGA CGT GCC TCG GAC ATC CTG TCC TTT GCG CAC AAG TAC TGT GAA 1026 
Ser Gly Arg Ala Ser Asp Ile Leu Ser Phe Ala His Lys Tyr Cys Glu 
215 220 225 

CAA GCC GGA ATA ATA AAT GAC TCC CTC AGG GAG CAG CTT CTA GTT ACC 1074 
Glu Gly Gly Ile Ile Asn Glu Ser Leu Arg Glu Gln Leu Leu Val Thr 
230 235 240 

ATT CAG AAA ACA TTT AAT TAT AAT AAG GGA CAA TCA CAT CAG CTG TTT 1122 
Ile Gln Lys Thr Phe Asn Tyr Asn Lys Ala Gln Ser His Gln Leu Phe 
245 250 255 

CCA ATT ATA ATG GAG TGC ATC AAG AAG AAA GAA CTC GTC ACT GTG TTC 1170 
Ala Ile Ile Met Glu Cys Met Lys Lys Lys Glu Leu Val Thr Val Phe 
260 265 270 275 

AGA ATG GGT TCT GAG GGC CAG CAG GAC ATC GAG ATG GCA ATT TTA ACT 1218 
Arg Met Gly Ser Glu Gly Gln Gln Asp Ile Glu Met Ala Ile Leu Thr 
280 285 290 

GCC CTG CTG AAA GGA ACA AAC GTA TCT GCT CCA GAT CAG CTG AGC TTG 1266 
Ala Leu Leu Lys Gly Thr Asn Val Ser Ala Pro Asp Gln Leu Ser Leu 
295 300 305 

GCA CTG GCT TGG AAC CGC GTG GAC ATA GCA CGA AGC CAG ATC TTT GTC 1314 
Ala Leu Ala Trp Asn Arg Val Asp Ile Ala Arg Ser Gln Ile Phe Val 
310 315 320 

TTT GGG CCC CAC TGG ACC CCC CTG GGA AGC CTG GCA CCC CCG ACC GAC 1362 
Phe Gly Pro His Trp Thr Pro Leu Gly Ser Leu Ala Pro Pro Thr Asp 
325 330 335 

AGC AAA GCC ACG GAG AAG GAG AAG AAG CCA CCC ATG GCC ACC ACC AAG 1410 
Ser Lys Ala Thr Glu Lys Glu Lys Lys Pro Pro Met Ala Thr Thr Lys 
340 345 350 355 
GGA GGA AGA GGA AAA GGG AAA GCC AAG AAG AAA GGG AAA GTG AAA GAG 1458 
Gly Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Gly Lys Val Lys Glu 
360 365 370 

GAA GTG GAG GAA GAA ACT GAC CCC CGG AAG ATA GAG CTG CTG AAC TGG 1506 
Glu Val Glu Glu Glu Thr Asp Pro Arg Lys Ile Glu Leu Leu Asn Trp 
375 380 385 

GTG AAT GCT TTG GAG CAA GCG ATG CTA GAT GCT TTA GTC TTA GAT CGT 1554 
Val Asn Ala Leu Glu Gln Ala Met Leu Asp Ala Leu Val Leu Asp Arg 
390 395 400 

GTC GAC TTT GTG AAG CTC CTG ATT GAA AAC GGA GTG AAC ATG CAA CAC 1602 
Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val Asn Met Gln His 
405 410 415 

TTT CTG ACC ATT CCG AGG CTG GAG GAG CTT TAT AAC AGA AGA CTG GGT 1650 
Phe Leu Thr Ile Pro Arg Leu Glu Glu Leu Tyr Asn Thr Arg Leu Gly 
420 425 430 435 

CCA CCA AAC ACA CTT CAT CTG CTG GTG AGG GAT GTG AAA AAG AGC AAC 1698 
Pro Pro Asn Thr Leu His Leu Leu Val Arg Asp Val Lys Lys Ser Asn 
440 445 450 

CTT CCG CCT GAT TAG CAC ATC ACC CTC ATA GAC ATC GGG CTC GTG CTG 1746 
Leu Pro Pro Asp Tyr His Ile Ser Leu Ile Asp Ile Gly Leu Val Leu 
455 460 465 

GAG TAC CTC ATG GGA GGA GCC TAC CGC TGC AAC TAG ACT CGG AAA AAC 1794 
Glu Tyr Leu Met Gly Gly Ala Tyr Arg Cys Asn Tyr Thr Arg Lys Asn 
470 475 480 

TTT CGG ACC CTT TAC AAC AAC TTG TTT GGA CCA AAG AGG CCT AAA GCT 1842 
Phe Arg Thr Leu Tyr Asn Asn Leu Phe Gly Pro Lys Arg Pro Lys Ala 
485 490 495 

CTT AAA CTT CTG GGA ATG GAA GAT GAT GAG CCT CCA GCT AAA GGG AAG 1890 
Leu Lys Leu Leu Gly Met Glu Asp Asp Glu Pro Pro Ala Lys Gly Lys 
500 505 510 515 

AAA AAA AAA AAA AAG AAA AAG GAG GAA GAG ATC GAC ATT GAT GTG GAC 1938 
Lys Lys Lys Lys Lys Lys Lys Glu Glu Glu Ile Asp Ile Asp Val Asp 
520 525 530 

GAC CCT GCC GTG AGT CGG TTC CAG TAT CCC TTC CAC GAG CTG ATG GTG 1986 
Asp Pro Ala Val Ser Arg Phe Gln Tyr Pro Phe His Glu Leu Met Val 
535 540 545 

TGG GGA GTG CTG ATG AAA CGC CAG AAA ATG GGA GTG TTC CTC TGG CAG 2034 
Trp Ala Val Leu Met Lys Arg Gin Lys Met Ala Val Phe Leu Trp Gin 
550 555 560 

CGA GGG GAA GAG AGC ATC GCC AAG GCC CTC CTG GCC TGC AAG CTC TAC 2082 
Arg Gly Glu Glu Ser Met Ala Lys Ala Leu Val Ala Cys Lye Leu Tyr* 
565 570 575 

AAG GCC ATG GCC CAC GAG TCC TCC GAG AGT GAT CTG GTG GAT CAC ATC 2130 
Lye Ala Met Ala His Glu Ser Ser Glu Ser Asp Leu Val Asp Aep He 
580 565 590 59S 

TCC CAG GAC TTG GAT AAC AAT TCC AAA GAC TTC GGC CAG CTT GCT TTG 2178 
Ser Gln Asp Leu Asp Asn Asn Ser Lys Asp Phe Gly Gln Leu Ala Leu
600 605 610

GAG TTA TTA GAC CAG TCC TAT AAG CAT GAC GAG CAG ATC GCT ATG AAA 2226
Glu Leu Leu Asp Gln Ser Tyr Lys His Asp Glu Gln Ile Ala Met Lys
615 620 625

CTC CTG ACC TAC GAG CTG AAA AAC TGG AGC AAC TCG ACC TGC CTC AAA 2274
Leu Leu Thr Tyr Glu Leu Lys Asn Trp Ser Asn Ser Thr Cys Leu Lys
630 635 640

CTG GCC GTG CCA GCC AAA CAC CCC GAC TTC ATT GCT CAC ACC TGC AGC 2322
Leu Ala Val Ala Ala Lys His Arg Asp Phe Ile Ala His Thr Cys Ser
645 650 655

CAG ATG CTG CTG ACC GAT ATG TGG ATG CCA AGA CTG CGG ATG CGG AAG 2370
Gln Met Leu Leu Thr Asp Met Trp Met Gly Arg Leu Arg Met Arg Lys
660 665 670 675

AAC CCC GGC CTG AAG GTT ATC ATC CCG ATT CTT CTA CCC CCC ACC ATC 2418
Asn Pro Gly Leu Lys Val Ile Met Gly Ile Leu Leu Pro Pro Thr Ile
680 685 690

TTG TTT TTG GAA TTT CCC ACA TAT GAT GAT TTC TCG TAT CAA ACA TCC 2466
Leu Phe Leu Glu Phe Arg Thr Tyr Asp Asp Phe Ser Tyr Gln Thr Ser
695 700 705

AAG GAA AAC GAG GAT GGC AAA GAA AAA GAA GAG GAA AAT ACC GAT GCA 2514
Lys Glu Asn Glu Asp Gly Lys Glu Lys Glu Glu Glu Asn Thr Asp Ala
710 715 720

AAT GCA GAT GCT GGC TCA AGA AAG GGC GAT GAG GAG AAC GAG CAT AAA 2562
Asn Ala Asp Ala Gly Ser Arg Lys Gly Asp Glu Glu Asn Glu His Lys
725 730 735

*** ? GA A ?° CCC ATC °«* A CA AAA ATC TGT AAA TTC TAT AAC 2610
Lys Gln Arg Ile Ile Pro Ile Gly Thr Lys Ile Cys Lys Phe Tyr Asn
740 745 750 755

GCC CCC ATT GTC AAG TTC TGC TTT TAC ACA ATA TCA TAC TTG GGC TAC 2658
Ala Pro Ile Val Lys Phe Trp Phe Tyr Thr Ile Ser Tyr Leu Gly Tyr
760 765 770

CTG CTG CTG TTT AAC TAC GTC ATC CTG GTC CGG ATG GAT GGC TGG CCG 2706
Leu Leu Leu Phe Asn Tyr Val Ile Leu Val Arg Met Asp Gly Trp Pro
775 780 785

TCC CTC CAG GAG TGG ATC GTC ATC TCC TAC ATC GTG AGC CTG GCC TTA 2754
Ser Leu Gln Glu Trp Ile Val Ile Ser Tyr Ile Val Ser Leu Ala Leu
790 795 800

GAG AAG ATA CGA GAG ATC CTC ATG TCA GAA CCA GGC AAA CTC AGC CAG 2802
Glu Lys Ile Arg Glu Ile Leu Met Ser Glu Pro Gly Lys Leu Ser Gln
805 810 815

AAA ATC AAA GTT TGG CTT CAG GAG TAC TGG AAC ATC ACA GAT CTC GTG 2850
Lys Ile Lys Val Trp Leu Gln Glu Tyr Trp Asn Ile Thr Asp Leu Val
820 825 830 835

GCC ATT TCC ACA TTC ATG ATT GGA GCA ATG GCC ACC AGA TCT GTG ATG 2898
Ala Ile Ser Thr Phe Met Ile Gly Ala Met Ala Thr Arg Ser Val Met
840 845 850

ATG ATT GGA AAG ATG ATG ATC GAC ATC CTG TAC TTT GTG GTC ATC ATG 2946
Met Ile Gly Lys Met Met Ile Asp Met Leu Tyr Phe Val Val Ile Met
855 860 865

CTC GTC GTO CTC ATG ACT TTC CGA CTA GCC CGT CAA GCC ATT CTC CAT 2994
Leu Val Val Leu Met Ser Phe Gly Val Ala Arg Gln Ala Ile Leu His
870 875 880

o~ ^ C ^ CC TCT AAA CTC GCC CGA AAC ATC TTC TAC ATG 3042
Pro Glu Glu Lys Pro Ser Trp Lys Leu Ala Arg Asn Ile Phe Tyr Met
885 890 895

CCC TAC TGG ATG ATC TAT GGA GAG GTO TTT CCA CAC GAG ATA GAC CTC 3090
Pro Tyr Trp Met Ile Tyr Gly Glu Val Phe Ala Asp Gln Ile Asp Leu
900 905 910 915

TAC GCC ATG GAA ATT AAT CCT CCT TGT GGT GAG AAC CTA TAT GAT GAC 3138
Tyr Ala Met Glu Ile Asn Pro Pro Cys Gly Glu Asn Leu Tyr Asp Glu
920 925 930

GAG GCC AAG CGG CTT CCT CCC TGT ATC CCC GCC GCC TGG CTC ACT CCA 3186
Glu Gly Lys Arg Leu Pro Pro Cys Ile Pro Gly Ala Trp Leu Thr Pro
935 940 945

CCA CTC ATG GCG TGC TAT CTA CTG GTC GCC AAC ATC CTG CTO GTO AAC 3234
Ala Leu Met Ala Cys Tyr Leu Leu Val Ala Asn Ile Leu Leu Val Asn
950 955 960

CTG CTG ATT GCT CTG TTC AAC AAT ACC TTC TTT GAA CTA AAA TCA ATA 3282
Leu Leu Ile Ala Val Phe Asn Asn Thr Phe Phe Glu Val Lys Ser Ile
965 970 975

TCC AAC CAG GTO TGG AAG TTC CAG CGA TAT CAG CTO ATT ATG ACA TTT 3330
Ser Asn Gln Val Trp Lys Phe Gln Arg Tyr Gln Leu Ile Met Thr Phe
980 985 990 995

CAT GAC AGG CCA GTC CTG CCC CCA CCG ATG ATC ATT TTA ACC CAC ATC 3378
His Asp Arg Pro Val Leu Pro Pro Pro Met Ile Ile Leu Ser His Ile
1000 1005 1010

™ C A *f A *f ATG CGT CTC AGC CGC CCC TGC AGG AAA AAG AGA CAA 3426
Tyr Ile Ile Ile Met Arg Leu Ser Gly Arg Cys Arg Lys Lys Arg Glu
1015 1020 1025

CCG GAC CAA GAG GAA CGC CAT CCT CCA TTC AAG CTC TTC CTT AGC GAC 3474
Gly Asp Gln Glu Glu Arg Asp Arg Gly Leu Lys Leu Phe Leu Ser Asp
1030 1035 1040

GAG GAG CTA AAG AGG CTC CAT GAG TTC GAG GAG CAG TGC GTC CAG GAG 3522
Glu Glu Leu Lys Arg Leu His Glu Phe Glu Glu Gln Cys Val Gln Glu
1045 1050 1055

CAC TTC COO GAG AAG CAO GAT GAC CAG CAG TCG TCC AGC GAC GAG CGC 3570
His Phe Arg Glu Lys Glu Asp Glu Gln Gln Ser Ser Ser Asp Glu Arg
1060 1065 1070 1075

A Tf f 00 2*? GAA m GTT GAA *** A «S TCA ATG AGG TTC CAA 3618
Ile Arg Val Thr Ser Glu Arg Val Glu Asn Met Ser Met Arg Leu Glu
1080 1085 1090

GAA ATC AAT GAA AGA GAA ACT TTT ATG AAA ACT TCC CTG CAG ACT GTT 3666
Glu Ile Asn Glu Arg Glu Thr Phe Met Lys Thr Ser Leu Gln Thr Val
1095 1100 1105

GAC CTT CGA CTT GCT CAG CTA GAA CAA TTA TCT AAC AGA ATG GTC AAT 3714
Asp Leu Arg Leu Ala Gln Leu Glu Glu Leu Ser Asn Arg Met Val Asn
1110 1115 1120

GCT CTT GAA AAT CTT GCG GGA ATC GAC AGG TCT GAC CTG ATC CAG GCA 3762
Ala Leu Glu Asn Leu Ala Gly Ile Asp Arg Ser Asp Leu Ile Gln Ala
1125 1130 1135

CGG TCC CGG GCT TCT TCT GAA TGT GAG GCA ACG TAT CTT CTC CGG CAA 3810
Arg Ser Arg Ala Ser Ser Glu Cys Glu Ala Thr Tyr Leu Leu Arg Gln
1140 1145 1150 1155

AGC AGC ATC AAT AGC GCT GAT GGC TAC AGC TTG TAT CGA TAT CAT TTT 3858
Ser Ser Ile Asn Ser Ala Asp Gly Tyr Ser Leu Tyr Arg Tyr His Phe
1160 1165 1170

AAC GGA GAA GAG TTA TTA TTT GAG GAT ACA TCT CTC TCC ACG TCA CCA 3906
Asn Gly Glu Glu Leu Leu Phe Glu Asp Thr Ser Leu Ser Thr Ser Pro
1175 1180 1185

GGG ACA GGA GTC AGG AAA AAA ACC TGT TCC TTC CGT ATA AAG GAA GAG 3954
Gly Thr Gly Val Arg Lys Lys Thr Cys Ser Phe Arg Ile Lys Glu Glu
1190 1195 1200

AAG GAC GTG AAA ACG CAC CTA GTC CCA GAA TGT CAG AAC AGT CTT CAC 4002
Lys Asp Val Lys Thr His Leu Val Pro Glu Cys Gln Asn Ser Leu His
1205 1210 1215

CTT TCA CTG GGC ACA AGC ACA TCA GCA ACC CCA GAT GGC AGT CAC CTT 4050
Leu Ser Leu Gly Thr Ser Thr Ser Ala Thr Pro Asp Gly Ser His Leu
1220 1225 1230 1235

GCA GTA GAT GAC TTA AAG AAC GCT GAA GAG TCA AAA TTA GGT CCA GAT 4098
Ala Val Asp Asp Leu Lys Asn Ala Glu Glu Ser Lys Leu Gly Pro Asp
1240 1245 1250

ATT GGG ATT TCA AAG GAA GAT GAT GAA AGA CAG ACA GAC TCT AAA AAA 4146
Ile Gly Ile Ser Lys Glu Asp Asp Glu Arg Gln Thr Asp Ser Lys Lys
1255 1260 1265

GAA GAA ACT ATT TCC CCA AGT TTA AAT AAA ACA GAT GTG ATA CAT GGA 4194
Glu Glu Thr Ile Ser Pro Ser Leu Asn Lys Thr Asp Val Ile His Gly
1270 1275 1280

CAG GAC AAA TCA GAT GTT CAA AAC ACT CAG CTA ACA GTG GAA ACG ACA 4242
Gln Asp Lys Ser Asp Val Gln Asn Thr Gln Leu Thr Val Glu Thr Thr
1285 1290 1295

AAT ATA GAA GGC ACT ATT TCC TAT CCC CTG GAA GAA ACC AAA ATT ACA 4290
Asn Ile Glu Gly Thr Ile Ser Tyr Pro Leu Glu Glu Thr Lys Ile Thr
1300 1305 1310 1315

CGC TAT TTC CCC GAT GAA ACG ATC AAT GCT TGT AAA ACA ATG AAG TCC 4338
Arg Tyr Phe Pro Asp Glu Thr Ile Asn Ala Cys Lys Thr Met Lys Ser
1320 1325 1330

ACA AGC TTC GTC TAT TCC CGG GGA AGA AAG CTG GTC GGT GGG GTT AAC 4386
Arg Ser Phe Val Tyr Ser Arg Gly Arg Lys Leu Val Gly Gly Val Asn
1335 1340 1345

CAG GAT GTA GAG TAC AGT TCA ATC ACG GAC CAG CAA TTG ACG ACG GAA 4434
Gln Asp Val Glu Tyr Ser Ser Ile Thr Asp Gln Gln Leu Thr Thr Glu
1350 1355 1360

TGG CAA TGC CAA GTT CAA AAG ATC ACG CGC TCT CAT AGC ACA GAT ATT 4482
Trp Gln Cys Gln Val Gln Lys Ile Thr Arg Ser His Ser Thr Asp Ile
1365 1370 1375

CCT TAC ATT GTG TCG GAA GCT GCA GTG CAA GCT GAG CAA AAA GAG GAG 4530
Pro Tyr Ile Val Ser Glu Ala Ala Val Gln Ala Glu Gln Lys Glu Gln
1380 1385 1390 1395

TTT GCA GAT ATG CAA GAT GAA CAC CAT GTC GCT GAA GCA ATT CCT CGA 4578
Phe Ala Asp Met Gln Asp Glu His His Val Ala Glu Ala Ile Pro Arg
1400 1405 1410

ATC CCT CGC TTG TCC CTA ACC ATT ACT GAC AGA AAT GGG ATG GAA AAC 4626
Ile Pro Arg Leu Ser Leu Thr Ile Thr Asp Arg Asn Gly Met Glu Asn
1415 1420 1425

TTA CTG TCT GTG AAG CCA GAT CAA ACT TTG GGA TTC CCA TCT CTC AGG 4674
Leu Leu Ser Val Lys Pro Asp Gln Thr Leu Gly Phe Pro Ser Leu Arg
1430 1435 1440

TCA AAA AGT TTA CAT GGA CAT CCT AGG AAT GTG AAA TCC ATT GAG GGA 4722
Ser Lys Ser Leu His Gly His Pro Arg Asn Val Lys Ser Ile Gln Gly
1445 1450 1455

AAG TTA GAC AGA TCT GGA CAT GCC AGT AGT GTA AGC AGC TTA GTA ATT 4770
Lys Leu Asp Arg Ser Gly His Ala Ser Ser Val Ser Ser Leu Val Ile
1460 1465 1470 1475

GTG TCT GGA ATG AGA GCA GAA GAA AAA AAG GTT AAG AAA GAG AAA GCT 4818
Val Ser Gly Met Thr Ala Glu Glu Lys Lys Val Lys Lys Glu Lys Ala
1480 1485 1490

TCC AGA GAA ACT GAA TCC T AGTCTGTTTT GTTTCTTTAA TTTTTTTTTT 4867
Ser Thr Glu Thr Glu Cys
1495

TAACAGTCAG AACCACTAAT GGGTGTCATC TTGGCCATCC TAAACATCCA TCCAATTTCC 4927 
TAAAAACATT TTCCCTT 4944 
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1497 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Tyr Ile Arg Val Ser Tyr Asp Thr Lys Pro Asp Ser Leu Leu His
1 5 10 15

Leu Met Val Lys Asp Trp Gln Leu Glu Leu Pro Lys Leu Leu Ile Ser
20 25 30

Val His Gly Gly Leu Gln Asn Phe Glu Met Gln Pro Lys Leu Lys Gln
35 40 45

Val Phe Gly Lys Gly Leu Ile Lys Ala Ala Met Thr Thr Gly Ala Trp
50 55 60

Ile Phe Thr Gly Gly Val Ser Thr Gly Val Ile Ser His Val Gly Asp
65 70 75 80

Ala Leu Lys Asp His Ser Ser Lys Ser Arg Gly Arg Val Cys Ala Ile
85 90 95

Gly Ile Ala Pro Trp Gly Ile Val Glu Asn Lys Glu Asp Leu Val Gly
100 105 110

Lys Asp Val Thr Arg Val Tyr Gln Thr Met Ser Asn Pro Leu Ser Lys
115 120 125

Leu Ser Val Leu Asn Asn Ser His Thr His Phe Ile Leu Ala Asp Asn
130 135 140

Gly Thr Leu Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Leu Leu
145 150 155 160

Glu Lys His Ile Ser Leu Gln Lys Ile Asn Thr Arg Leu Gly Gln Gly
165 170 175

Val Pro Leu Val Gly Leu Val Val Glu Gly Gly Pro Asn Val Val Ser
180 185 190

Ile Val Leu Glu Tyr Leu Gln Glu Glu Pro Pro Ile Pro Val Val Ile
195 200 205

Cys Asp Gly Ser Gly Arg Ala Ser Asp Ile Leu Ser Phe Ala His Lys
210 215 220

Tyr Cys Glu Glu Gly Gly Ile Ile Asn Glu Ser Leu Arg Glu Gln Leu
225 230 235 240

Leu Val Thr Ile Gln Lys Thr Phe Asn Tyr Asn Lys Ala Gln Ser His
245 250 255

Gln Leu Phe Ala Ile Ile Met Glu Cys Met Lys Lys Lys Glu Leu Val
260 265 270

Thr Val Phe Arg Met Gly Ser Glu Gly Gln Gln Asp Ile Glu Met Ala
275 280 285

Ile Leu Thr Ala Leu Leu Lys Gly Thr Asn Val Ser Ala Pro Asp Gln
290 295 300

Leu Ser Leu Ala Leu Ala Trp Asn Arg Val Asp Ile Ala Arg Ser Gln
305 310 315 320

Ile Phe Val Phe Gly Pro His Trp Thr Pro Leu Gly Ser Leu Ala Pro
325 330 335

Pro Thr Asp Ser Lys Ala Thr Glu Lys Glu Lys Lys Pro Pro Met Ala
340 345 350

Thr Thr Lys Gly Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Gly Lys
355 360 365

Val Lys Glu Glu Val Glu Glu Glu Thr Asp Pro Arg Lys Ile Glu Leu
370 375 380

Leu Asn Trp Val Asn Ala Leu Glu Gln Ala Met Leu Asp Ala Leu Val
385 390 395 400

Leu Asp Arg Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val Asn
405 410 415

Met Gln His Phe Leu Thr Ile Pro Arg Leu Glu Glu Leu Tyr Asn Thr
420 425 430

Arg Leu Gly Pro Pro Asn Thr Leu His Leu Leu Val Arg Asp Val Lys
435 440 445

Lys Ser Asn Leu Pro Pro Asp Tyr His Ile Ser Leu Ile Asp Ile Gly
450 455 460

Leu Val Leu Glu Tyr Leu Met Gly Gly Ala Tyr Arg Cys Asn Tyr Thr
465 470 475 480

Arg Lys Asn Phe Arg Thr Leu Tyr Asn Asn Leu Phe Gly Pro Lys Arg
485 490 495

Pro Lys Ala Leu Lys Leu Leu Gly Met Glu Asp Asp Glu Pro Pro Ala
500 505 510

Lys Gly Lys Lys Lys Lys Lys Lys Lys Lys Glu Glu Glu Ile Asp Ile
515 520 525

Asp Val Asp Asp Pro Ala Val Ser Arg Phe Gln Tyr Pro Phe His Glu
530 535 540

Leu Met Val Trp Ala Val Leu Met Lys Arg Gln Lys Met Ala Val Phe
545 550 555 560

Leu Trp Gln Arg Gly Glu Glu Ser Met Ala Lys Ala Leu Val Ala Cys
565 570 575

Lys Leu Tyr Lys Ala Met Ala His Glu Ser Ser Glu Ser Asp Leu Val
580 585 590

Asp Asp Ile Ser Gln Asp Leu Asp Asn Asn Ser Lys Asp Phe Gly Gln
595 600 605

Leu Ala Leu Glu Leu Leu Asp Gln Ser Tyr Lys His Asp Glu Gln Ile
610 615 620

Ala Met Lys Leu Leu Thr Tyr Glu Leu Lys Asn Trp Ser Asn Ser Thr
625 630 635 640

Cys Leu Lys Leu Ala Val Ala Ala Lys His Arg Asp Phe Ile Ala His
645 650 655

Thr Cys Ser Gln Met Leu Leu Thr Asp Met Trp Met Gly Arg Leu Arg
660 665 670

Met Arg Lys Asn Pro Gly Leu Lys Val Ile Met Gly Ile Leu Leu Pro
675 680 685

Pro Thr Ile Leu Phe Leu Glu Phe Arg Thr Tyr Asp Asp Phe Ser Tyr
690 695 700

Gln Thr Ser Lys Glu Asn Glu Asp Gly Lys Glu Lys Glu Glu Glu Asn
705 710 715 720

Thr Asp Ala Asn Ala Asp Ala Gly Ser Arg Lys Gly Asp Glu Glu Asn
725 730 735

Glu His Lys Lys Gln Arg Ile Ile Pro Ile Gly Thr Lys Ile Cys Lys
740 745 750

Phe Tyr Asn Ala Pro Ile Val Lys Phe Trp Phe Tyr Thr Ile Ser Tyr
755 760 765

Leu Gly Tyr Leu Leu Leu Phe Asn Tyr Val Ile Leu Val Arg Met Asp
770 775 780

Gly Trp Pro Ser Leu Gln Glu Trp Ile Val Ile Ser Tyr Ile Val Ser
785 790 795 800

Leu Ala Leu Glu Lys Ile Arg Glu Ile Leu Met Ser Glu Pro Gly Lys
805 810 815

Leu Ser Gln Lys Ile Lys Val Trp Leu Gln Glu Tyr Trp Asn Ile Thr
820 825 830

Asp Leu Val Ala Ile Ser Thr Phe Met Ile Gly Ala Met Ala Thr Arg
835 840 845

Ser Val Met Met Ile Gly Lys Met Met Ile Asp Met Leu Tyr Phe Val
850 855 860

Val Ile Met Leu Val Val Leu Met Ser Phe Gly Val Ala Arg Gln Ala
865 870 875 880

Ile Leu His Pro Glu Glu Lys Pro Ser Trp Lys Leu Ala Arg Asn Ile
885 890 895

Phe Tyr Met Pro Tyr Trp Met Ile Tyr Gly Glu Val Phe Ala Asp Gln
900 905 910

Ile Asp Leu Tyr Ala Met Glu Ile Asn Pro Pro Cys Gly Glu Asn Leu
915 920 925

Tyr Asp Glu Glu Gly Lys Arg Leu Pro Pro Cys Ile Pro Gly Ala Trp
930 935 940

Leu Thr Pro Ala Leu Met Ala Cys Tyr Leu Leu Val Ala Asn Ile Leu
945 950 955 960

Leu Val Asn Leu Leu Ile Ala Val Phe Asn Asn Thr Phe Phe Glu Val
965 970 975

Lys Ser Ile Ser Asn Gln Val Trp Lys Phe Gln Arg Tyr Gln Leu Ile
980 985 990

Met Thr Phe His Asp Arg Pro Val Leu Pro Pro Pro Met Ile Ile Leu
995 1000 1005

Ser His Ile Tyr Ile Ile Ile Met Arg Leu Ser Gly Arg Cys Arg Lys
1010 1015 1020

Lys Arg Glu Gly Asp Gln Glu Glu Arg Asp Arg Gly Leu Lys Leu Phe
1025 1030 1035 1040

Leu Ser Asp Glu Glu Leu Lys Arg Leu His Glu Phe Glu Glu Gln Cys
1045 1050 1055

Val Gln Glu His Phe Arg Glu Lys Glu Asp Glu Gln Gln Ser Ser Ser
1060 1065 1070

Asp Glu Arg Ile Arg Val Thr Ser Glu Arg Val Glu Asn Met Ser Met
1075 1080 1085

Arg Leu Glu Glu Ile Asn Glu Arg Glu Thr Phe Met Lys Thr Ser Leu
1090 1095 1100

Gln Thr Val Asp Leu Arg Leu Ala Gln Leu Glu Glu Leu Ser Asn Arg
1105 1110 1115 1120

Met Val Asn Ala Leu Glu Asn Leu Ala Gly Ile Asp Arg Ser Asp Leu
1125 1130 1135

Ile Gln Ala Arg Ser Arg Ala Ser Ser Glu Cys Glu Ala Thr Tyr Leu
1140 1145 1150

Leu Arg Gln Ser Ser Ile Asn Ser Ala Asp Gly Tyr Ser Leu Tyr Arg
1155 1160 1165

Tyr His Phe Asn Gly Glu Glu Leu Leu Phe Glu Asp Thr Ser Leu Ser
1170 1175 1180

Thr Ser Pro Gly Thr Gly Val Arg Lys Lys Thr Cys Ser Phe Arg Ile
1185 1190 1195 1200

Lys Glu Glu Lys Asp Val Lys Thr His Leu Val Pro Glu Cys Gln Asn
1205 1210 1215

Ser Leu His Leu Ser Leu Gly Thr Ser Thr Ser Ala Thr Pro Asp Gly
1220 1225 1230

Ser His Leu Ala Val Asp Asp Leu Lys Asn Ala Glu Glu Ser Lys Leu
1235 1240 1245

Gly Pro Asp Ile Gly Ile Ser Lys Glu Asp Asp Glu Arg Gln Thr Asp
1250 1255 1260

Ser Lys Lys Glu Glu Thr Ile Ser Pro Ser Leu Asn Lys Thr Asp Val
1265 1270 1275 1280

Ile His Gly Gln Asp Lys Ser Asp Val Gln Asn Thr Gln Leu Thr Val
1285 1290 1295

Glu Thr Thr Asn Ile Glu Gly Thr Ile Ser Tyr Pro Leu Glu Glu Thr
1300 1305 1310

Lys Ile Thr Arg Tyr Phe Pro Asp Glu Thr Ile Asn Ala Cys Lys Thr
1315 1320 1325

Met Lys Ser Arg Ser Phe Val Tyr Ser Arg Gly Arg Lys Leu Val Gly
1330 1335 1340

Gly Val Asn Gln Asp Val Glu Tyr Ser Ser Ile Thr Asp Gln Gln Leu
1345 1350 1355 1360

Thr Thr Glu Trp Gln Cys Gln Val Gln Lys Ile Thr Arg Ser His Ser
1365 1370 1375

Thr Asp Ile Pro Tyr Ile Val Ser Glu Ala Ala Val Gln Ala Glu Gln
1380 1385 1390

Lys Glu Gln Phe Ala Asp Met Gln Asp Glu His His Val Ala Glu Ala
1395 1400 1405

Ile Pro Arg Ile Pro Arg Leu Ser Leu Thr Ile Thr Asp Arg Asn Gly
1410 1415 1420

Met Glu Asn Leu Leu Ser Val Lys Pro Asp Gln Thr Leu Gly Phe Pro
1425 1430 1435 1440

Ser Leu Arg Ser Lys Ser Leu His Gly His Pro Arg Asn Val Lys Ser
1445 1450 1455

Ile Gln Gly Lys Leu Asp Arg Ser Gly His Ala Ser Ser Val Ser Ser
1460 1465 1470

Leu Val Ile Val Ser Gly Met Thr Ala Glu Glu Lys Lys Val Lys Lys
1475 1480 1485

Glu Lys Ala Ser Thr Glu Thr Glu Cys
1490 1495

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5055 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 346..4945

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

ACTCATTATA GGGNTCGAGC GGCCGCCCGG GCAGGTTTGA GCTGTGCCCT CTCCATTCCA 60 

CTGCTGTGGC AGGGTCAGAA ATCTTGGATA GAGAAAACCT TTTGCAAACG GGAATGTATC 120 

TTTGTAATTC CTAGCACGAA AGACTCTAAC AGGTGTTCCT GTGGCCAGTT CACCAACCAG 180 

CATATCCCCC CTCTGCCAAG TGCAACACCC AGCAAAAATG AAGAGGAAAG CAAACAGGTG 240 

GAGACTCAGC CTGAGAAATG GTCTCTTGCC AAGCACACCC AGAGCTACCC AACAGATTCC 300 

TATGGAGTTC TTGAATTCCA GGGTGGCGGA TATTCCAATA AAGCC ATG TAT ATC 354 

Met Tyr Ile
1

CGT GTA TCC TAT GAC ACC AAG CCA GAC TCA CTG CTC CAT CTC ATG GTG 402
Arg Val Ser Tyr Asp Thr Lys Pro Asp Ser Leu Leu His Leu Met Val
5 10 15

AAA GAT TGG CAG CTG GAA CTC CCC AAG CTC TTA ATA TCT GTG CAT GGA 450
Lys Asp Trp Gln Leu Glu Leu Pro Lys Leu Leu Ile Ser Val His Gly
20 25 30 35

GCC CTC CAG AAC TTT GAG ATG CAG CCC AAG CTG AAA CAA GTC TTT GGG 498
Gly Leu Gln Asn Phe Glu Met Gln Pro Lys Leu Lys Gln Val Phe Gly
40 45 50

AAA GGC CTG ATC AAG GCT GCT ATG ACC ACC GGG GCC TGG ATC TTC ACC 546
Lys Gly Leu Ile Lys Ala Ala Met Thr Thr Gly Ala Trp Ile Phe Thr
55 60 65

GGG GGT CTC AGC ACA OCT GTT ATC AGC CAC GTA GCG CAT GCC TTG AAA 594
Gly Gly Val Ser Thr Gly Val Ile Ser His Val Gly Asp Ala Leu Lys
70 75 80

GAC CAC TCC TCC AAG TCC AGA GGC CCC GTT TGT GCT ATA OCA ATT GCT 642
Asp His Ser Ser Lys Ser Arg Gly Arg Val Cys Ala Ile Gly Ile Ala
85 90 95

CCA TGG GGC ATC GTG GAG AAT AAG GAA GAC CTC GTT GCA AAG GAT CTA 690
Pro Trp Gly Ile Val Glu Asn Lys Glu Asp Leu Val Gly Lys Asp Val
100 105 110 115

ACA AGA GTG TAC CAG ACC ATG TCC AAC CCT CTA AGT AAG CTC TCT GTG 738
Thr Arg Val Tyr Gln Thr Met Ser Asn Pro Leu Ser Lys Leu Ser Val
120 125 130

CTC AAC AAC TCC CAC ACC CAC TTC ATC CTG CCT GAC AAT GGC ACC CTG 786
Leu Asn Asn Ser His Thr His Phe Ile Leu Ala Asp Asn Gly Thr Leu
135 140 145

GGC AAG TAT GGC GCC GAG GTG AAG CTG CGA AGG CTG CTG GAA AAG CAC 834
Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Leu Leu Glu Lys His
150 155 160

ATC TCC CTC CAG AAG ATC AAC ACA AGA CTG GGC CAG GGC CTC CCC CTC 882
Ile Ser Leu Gln Lys Ile Asn Thr Arg Leu Gly Gln Gly Val Pro Leu
165 170 175

GTG GGT CTC GTG GTG GAG GGG GGC CCT AAC GTG CTC TCC ATC CTC TTC 930
Val Gly Leu Val Val Glu Gly Gly Pro Asn Val Val Ser Ile Val Leu
180 185 190 195

GAA TAC CTG CAA GAA GAG CCT CCC ATC CCT GTG GTG ATT TCT GAT GGC 978
Glu Tyr Leu Gln Glu Glu Pro Pro Ile Pro Val Val Ile Cys Asp Gly
200 205 210

ACC GGA CGT GCC TCG GAC ATC CTG TCC TTT GCG CAC AAG TAC TGT GAA 1026
Ser Gly Arg Ala Ser Asp Ile Leu Ser Phe Ala His Lys Tyr Cys Glu
215 220 225

GAA GGC GGA ATA ATA AAT GAC TCC CTC ACC GAC CAG CTT CTA GTT ACC 1074
Glu Gly Gly Ile Ile Asn Glu Ser Leu Arg Glu Gln Leu Leu Val Thr
230 235 240

ATT CAG AAA ACA TTT AAT TAT AAT AAG GCA CAA TCA CAT CAG CTG TTT 1122
Ile Gln Lys Thr Phe Asn Tyr Asn Lys Ala Gln Ser His Gln Leu Phe
245 250 255

GCA ATT ATA ATG GAG TGC ATG AAG AAG AAA GAA CTC CTC ACT GTG TTC 1170
Ala Ile Ile Met Glu Cys Met Lys Lys Lys Glu Leu Val Thr Val Phe
260 265 270 275

AGA ATG GCT TCT GAG GGC CAG CAG GAC ATC GAG ATG GCA ATT TTA ACT 1218
Arg Met Gly Ser Glu Gly Gln Gln Asp Ile Glu Met Ala Ile Leu Thr
280 285 290

GCC CTG CTG AAA GGA ACA AAC GTA TCT GCT CCA CAT CAG CTG AGC TTC 1266
Ala Leu Leu Lys Gly Thr Asn Val Ser Ala Pro Asp Gln Leu Ser Leu
295 300 305

CCA CTG GCT TGG AAC CCC GTG GAC ATA GCA CGA ACC CAG ATC TTT CTC 1314
Ala Leu Ala Trp Asn Arg Val Asp Ile Ala Arg Ser Gln Ile Phe Val
310 315 320

TTT GGG CCC CAC TGG AOO CCC CTG GGA AGC CTO GCA CCC CCG ACO CAC 1362
Phe Gly Pro His Trp Thr Pro Leu Gly Ser Leu Ala Pro Pro Thr Asp
325 330 335

AGC AAA GCC AOC CAG AAG GAG AAC AAG CCA CCC ATG GCC ACC ACC AAG 1410
Ser Lys Ala Thr Glu Lys Glu Lys Lys Pro Pro Met Ala Thr Thr Lys
340 345 350 355

GGA GGA AGA GGA AAA GGG AAA GGC AAG AAC AAA GGG AAA GTG AAA 9*** 1458
Gly Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Gly Lys Val Lys Glu
360 365 370

GAA CTO GAG CAA GAA ACT GAC CCC CGG AAG ATA GAG CTO CTG AAC TGG 1506
Glu Val Glu Glu Glu Thr Asp Pro Arg Lys Ile Glu Leu Leu Asn Trp
375 380 385

GTG AAT GCT TTG GAG CAA GCG ATG CTA GAT GCT TTA CTC TTA CAT OCT 1554
Val Asn Ala Leu Glu Gln Ala Met Leu Asp Ala Leu Val Leu Asp Arg
390 395 400

GTC GAC TTT GTG AAG CTC CTO ATT GAA AAC CGA GTG AAC ATG CAA CAC 1602
Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val Asn Met Gln His
405 410 415

TTT CTG ACC ATT CCC AW CTO GAG GAG CTT TAT AAC ACA AGA CTG GGT 1650
Phe Leu Thr Ile Pro Arg Leu Glu Glu Leu Tyr Asn Thr Arg Leu Gly
420 425 430 435

CCA CCA AAC ACA CTT CAT CTG CTG GTG AGG GAT GTG AAA AAG AGC AAC 1698
Pro Pro Asn Thr Leu His Leu Leu Val Arg Asp Val Lys Lys Ser Asn
440 445 450

CTT CCO CCT GAT TAC CAC ATC AGC CTC ATA GAC ATC GGG CTC GTG CTO 1746
Leu Pro Pro Asp Tyr His Ile Ser Leu Ile Asp Ile Gly Leu Val Leu
455 460 465

GAG TAC CTC ATG CCA GGA GCC TAC OCC TGC AAC TAC ACT CGG AAA AAC 1794
Glu Tyr Leu Met Gly Gly Ala Tyr Arg Cys Asn Tyr Thr Arg Lys Asn
470 475 480

TTT CGG ACC CTT TAC AAC AAC TTG TTT GGA CCA AAG AGG CCT AAA GCT 1842
Phe Arg Thr Leu Tyr Asn Asn Leu Phe Gly Pro Lys Arg Pro Lys Ala
485 490 495

CTT AAA CTT CTG GGA ATG CAA GAT GAT OAC CCT CCA GCT AAA GGG AAG 1890
Leu Lys Leu Leu Gly Met Glu Asp Asp Glu Pro Pro Ala Lys Gly Lys
500 505 510 515

AAA AAA AAA AAA AAG AAA AAG GAG GAA GAG ATC GAC ATT GAT GTC GAC 1938
Lys Lys Lys Lys Lys Lys Lys Glu Glu Glu Ile Asp Ile Asp Val Asp
520 525 530

GAC CCT GCC OTO ACT CGG TTC CAG TAT CCC TTC CAC GAG CTG ATC GTG 1986
Asp Pro Ala Val Ser Arg Phe Gln Tyr Pro Phe His Glu Leu Met Val
535 540 545

TGC GCA GTC CTG ATG AAA CGC CAC AAA ATG GCA GTG TTC CTC TGG CAG 2034
Trp Ala Val Leu Met Lys Arg Gln Lys Met Ala Val Phe Leu Trp Gln
550 555 560

CGA GGG GAA GAG AGC ATG GCC AAG GCC CTG GTG GCC TGC AAG CTC TAC 2082
Arg Gly Glu Glu Ser Met Ala Lys Ala Leu Val Ala Cys Lys Leu Tyr
565 570 575

AAG CCC ATG GCC CAC GAG TCC TCC GAG AGT GAT CTG GTG GAT CAC ATC 2130
Lys Ala Met Ala His Glu Ser Ser Glu Ser Asp Leu Val Asp Asp Ile
580 585 590 595

-nn n»r CAC TTG GAT AAC AAT TCC AAA GAC TTC GGC CAG CTT GCT TTG 2178
Ser Gln Asp Leu Asp Asn Asn Ser Lys Asp Phe Gly Gln Leu Ala Leu
600 605 610

CAG TTA TTA GAC CAG TCC TAT AAG CAT GAC GAG CAG ATC GCT ATG AAA 2226
Glu Leu Leu Asp Gln Ser Tyr Lys His Asp Glu Gln Ile Ala Met Lys
615 620 625

r ACC TAC GAG CTG AAA AAC TGG AGC AAC TCG ACC TGC CTC AAA 2274
Leu Leu Thr Tyr Glu Leu Lys Asn Trp Ser Asn Ser Thr Cys Leu Lys
630 635 640

CTG GCC GTG GCA GCC AAA CAC CGG GAC TTC ATT GCT CAC ACC TGC ACC 2322
Leu Ala Val Ala Ala Lys His Arg Asp Phe Ile Ala His Thr Cys Ser
645 650 655

CAG ATC CTG CTG ACC GAT ATG TGG ATG GGA AGA CTG CGG ATG CGG AAG 2370
Gln Met Leu Leu Thr Asp Met Trp Met Gly Arg Leu Arg Met Arg Lys
660 665 670 675

AAC CCC GGC CTG AAG GTT ATC ATG GGG ATT CTT CTA CCC CCC ACC ATC 2418
Asn Pro Gly Leu Lys Val Ile Met Gly Ile Leu Leu Pro Pro Thr Ile
680 685 690

TTG TTT TTG GAA TTT CCC ACA TAT GAT GAT TTC TCG TAT CAA ACA TCC 2466
Leu Phe Leu Glu Phe Arg Thr Tyr Asp Asp Phe Ser Tyr Gln Thr Ser
695 700 705

AAC CAA AAC GAG GAT GCC AAA GAA AAA GAA GAG GAA AAT ACC GAT GCA 2514
Lys Glu Asn Glu Asp Gly Lys Glu Lys Glu Glu Glu Asn Thr Asp Ala
710 715 720

__. - XT ggc TC A AGA AAG GGC CAT GAG GAG AAC GAG CAT AAA 2562
Asn Ala Asp Ala Gly Ser Arg Lys Gly Asp Glu Glu Asn Glu His Lys
725 730 735

AAA CAC AGA AGT ATT CCC ATC GGA ACA AAG ATC TGT GAA TTC TAT AAC 2610
Lys Gln Arg Ser Ile Pro Ile Gly Thr Lys Ile Cys Glu Phe Tyr Asn
740 745 750 755

GCC CCC ATT GTC AAG TTC TGC TTT TAC ACA ATA TCA TAC TTG GCC TAC 2658
Ala Pro Ile Val Lys Phe Trp Phe Tyr Thr Ile Ser Tyr Leu Gly Tyr
760 765 770

~*v. r~rv r-rr TTT AAC TAC GTC ATC CTG GTG CGG ATG GAT GGC TGG CCG 2706
Leu Leu Leu Phe Asn Tyr Val Ile Leu Val Arg Met Asp Gly Trp Pro
775 780 785

TCC CTC CAC GAG TGG ATC GTC ATC TCC TAC ATC GTG AGC CTG CCC TTA 2754
Ser Leu Gln Glu Trp Ile Val Ile Ser Tyr Ile Val Ser Leu Ala Leu
790 795 800

CAC AAG ATA CGA GAG ATC CTC ATG TCA GAA CCA GCC AAA CTC ACC CAC 2802
Glu Lys Ile Arg Glu Ile Leu Met Ser Glu Pro Gly Lys Leu Ser Gln
805 810 815

AAA ATC AAA GTT TCC CTT CAC GAG TAC TGG AAC ATC ACA GAT CTC GTO 2850
Lys Ile Lys Val Trp Leu Gln Glu Tyr Trp Asn Ile Thr Asp Leu Val
820 825 830 835

GCC ATT TCC ACA TTC ATG ATT GGA GCA ATT CTT CGC CTA CAG AAC GAG 2898
Ala Ile Ser Thr Phe Met Ile Gly Ala Ile Leu Arg Leu Gln Asn Gln
840 845 850

CCC TAC ATG GGC TAT GGC CGG GTG ATC TAC TGT GTG GAT ATC ATC TTC 2946
Pro Tyr Met Gly Tyr Gly Arg Val Ile Tyr Cys Val Asp Ile Ile Phe
855 860 865

TGG TAC ATC CGT GTC CTG GAC ATC TTT GGT GTC AAC AAG TAT CTG GGG 2994
Trp Tyr Ile Arg Val Leu Asp Ile Phe Gly Val Asn Lys Tyr Leu Gly
870 875 880

CCA TAC GTG ATG ATG ATT GGA AAG ATG ATG ATC GAC ATG CTG TAC TTT 3042
Pro Tyr Val Met Met Ile Gly Lys Met Met Ile Asp Met Leu Tyr Phe
885 890 895

GTG GTC ATC ATG CTG GTC GTG CTC ATG AGT TTC GGA GTA GCC CGT CAA 3090
Val Val Ile Met Leu Val Val Leu Met Ser Phe Gly Val Ala Arg Gln
900 905 910 915

GCC ATT CTG CAT CCA GAG GAG AAG CCC TCT TGG AAA CTG GCC CGA AAC 3138
Ala Ile Leu His Pro Glu Glu Lys Pro Ser Trp Lys Leu Ala Arg Asn
920 925 930

ATC TTC TAC ATG CCC TAC TGG ATG ATC TAT GGA GAG GTG TTT GCA GAC 3186
Ile Phe Tyr Met Pro Tyr Trp Met Ile Tyr Gly Glu Val Phe Ala Asp
935 940 945

CAG ATA GAC CTC TAC GCC ATG GAA ATT AAT CCT CCT TGT GGT GAG AAC 3234
Gln Ile Asp Leu Tyr Ala Met Glu Ile Asn Pro Pro Cys Gly Glu Asn
950 955 960

CTA TAT GAT GAG GAG GGC AAG CGG CTT CCT CCC TGT ATC CCC GGC GCC 3282
Leu Tyr Asp Glu Glu Gly Lys Arg Leu Pro Pro Cys Ile Pro Gly Ala
965 970 975

TCC CTC ACT CCA GCA CTC ATG GCG TGC TAT CTA CTG GTC GCC AAC ATC 3330
Trp Leu Thr Pro Ala Leu Met Ala Cys Tyr Leu Leu Val Ala Asn Ile
980 985 990 995

CTG CTC GTG AAC CTG CTG ATT GCT GTG TTC AAC AAT ACT TTC TTT GAA 3378
Leu Leu Val Asn Leu Leu Ile Ala Val Phe Asn Asn Thr Phe Phe Glu
1000 1005 1010

GTA AAA TCA ATA TCC AAC CAG GTG TGG AAG TTC CAG CGA TAT CAG CTG 3426
Val Lys Ser Ile Ser Asn Gln Val Trp Lys Phe Gln Arg Tyr Gln Leu
1015 1020 1025

ATT ATG ACA TTT CAT GAC AGG CCA GTC CTG CCC CCA CCG ATG ATC ATT 3474
Ile Met Thr Phe His Asp Arg Pro Val Leu Pro Pro Pro Met Ile Ile
1030 1035 1040

TTA AGC CAC ATC TAC ATC ATC ATT ATG CGT CTC AGC GGC CGC TGC AGG 3522
Leu Ser His Ile Tyr Ile Ile Ile Met Arg Leu Ser Gly Arg Cys Arg
1045 1050 1055

AAA AAG AGA GAA GGG GAC CAA GAG GAA CGG GAT CGT GGA TTG AAG CTC 3570
Lys Lys Arg Glu Gly Asp Gln Glu Glu Arg Asp Arg Gly Leu Lys Leu
1060 1065 1070 1075

TTC CTT AGC GAC GAG GAG CTA AAG AGG CTG CAT GAG TTC GAG GAG CAG 3618
Phe Leu Ser Asp Glu Glu Leu Lys Arg Leu His Glu Phe Glu Glu Gln
1080 1085 1090

TGC GTG CAG GAG CAC TTC CGG GAG AAG CAG GAT GAG CAG CAG TCG TCC 3666
Cys Val Gln Glu His Phe Arg Glu Lys Glu Asp Glu Gln Gln Ser Ser
1095 1100 1105

AGC GAC GAG CGC ATC CGG GTC ACT TCT GAA AGA GTT GAA AAT ATG TCA 3714
Ser Asp Glu Arg Ile Arg Val Thr Ser Glu Arg Val Glu Asn Met Ser
1110 1115 1120

ATG AGG TTG GAA GAA ATC AAT GAA AGA GAA ACT TTT ATG AAA ACT TCC 3762
Met Arg Leu Glu Glu Ile Asn Glu Arg Glu Thr Phe Met Lys Thr Ser
1125 1130 1135

CTG CAG ACT GTT GAC CTT CGA CTT GCT CAG CTA GAA GAA TTA TCT AAC 3810
Leu Gln Thr Val Asp Leu Arg Leu Ala Gln Leu Glu Glu Leu Ser Asn
1140 1145 1150 1155

AGA ATG GTG AAT GCT CTT GAA AAT CTT GCG GGA ATC GAC AGG TCT GAC 3858
Arg Met Val Asn Ala Leu Glu Asn Leu Ala Gly Ile Asp Arg Ser Asp
1160 1165 1170

CTG ATC CAG GGA CGG TCC CGG GCT TCT TCT GAA TGT GAG GCA ACG TAT 3906
Leu Ile Gln Ala Arg Ser Arg Ala Ser Ser Glu Cys Glu Ala Thr Tyr
1175 1180 1185

CTT CTC CGG CAA AGC AGC ATC AAT AGC GCT GAT GGC TAC AGC TTG TAT 3954
Leu Leu Arg Gln Ser Ser Ile Asn Ser Ala Asp Gly Tyr Ser Leu Tyr
1190 1195 1200

CGA TAT CAT TTT AAC GGA GAA GAG TTA TTA TTT GAG GAT ACA TCT CTC 4002
Arg Tyr His Phe Asn Gly Glu Glu Leu Leu Phe Glu Asp Thr Ser Leu
1205 1210 1215

TCC ACG TCA CCA GGG ACA GGA GTC AGG AAA AAA ACC TGT TCC TTC CGT 4050
Ser Thr Ser Pro Gly Thr Gly Val Arg Lys Lys Thr Cys Ser Phe Arg
1220 1225 1230 1235

ATA AAG GAA GAG AAG GAC GTG AAA ACG CAC CTA GTC CCA CAA TGT CAG 4098
Ile Lys Glu Glu Lys Asp Val Lys Thr His Leu Val Pro Glu Cys Gln
1240 1245 1250

AAC ACT CTT CAC CTT TCA CTG GGC ACA AGC ACA TCA GCA ACC CCA GAT 4146
Asn Ser Leu His Leu Ser Leu Gly Thr Ser Thr Ser Ala Thr Pro Asp
1255 1260 1265

GGC ACT CAC CTT GCA GTA GAT GAC TTA AAG AAC GCT GAA GAG TCA AAA 4194
Gly Ser His Leu Ala Val Asp Asp Leu Lys Asn Ala Glu Glu Ser Lys
1270 1275 1280

TTA GGT CCA GAT ATT GGG ATT TCA AAG GAA GAT GAT GAA AGA CAG ACA 4242
Leu Gly Pro Asp Ile Gly Ile Ser Lys Glu Asp Asp Glu Arg Gln Thr
1285 1290 1295

GAC TCT AAA AAA GAA GAA ACT ATT TCC CCA AGT TTA AAT AAA ACA GAT 4290
Asp Ser Lys Lys Glu Glu Thr Ile Ser Pro Ser Leu Asn Lys Thr Asp
1300 1305 1310 1315

GTG ATA CAT GGA CAG GAC AAA TCA GAT CTT CAA AAC ACT CAG CTA ACA 4338
Val Ile His Gly Gln Asp Lys Ser Asp Val Gln Asn Thr Gln Leu Thr
1320 1325 1330

GTG GAA ACG ACA AAT ATA GAA GGC ACT ATT TCC TAT CCC CTG GAA CAA 4386
Val Glu Thr Thr Asn Ile Glu Gly Thr Ile Ser Tyr Pro Leu Glu Glu
1335 1340 1345

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ACC AAA ATT ACA CGC TAT TTC CCC GAT GAA AOG ATC AAT GCT TGT AAA 4434 
Thr Lys Ile Thr Arg Tyr Phe Pro Asp Glu Thr Ile Asn Ala Cys Lys
1350 1355 1360

ACA ATG AAG TCC AGA AGC TTC GTC TAT TCC CGG GGA AGA AAG CTG GTC 4482
Thr Met Lys Ser Arg Ser Phe Val Tyr Ser Arg Gly Arg Lys Leu Val
1365 1370 1375

GGT GGG GTT AAC GAG GAT GTA GAG TAC AGT TCA ATC ACC GAC CAG GAA 4530
Gly Gly Val Asn Gln Asp Val Glu Tyr Ser Ser Ile Thr Asp Gln Gln
1380 1385 1390 1395

TTG ACG ACG GAA TGG CAA TGC CAA GTT GAA AAG ATC ACG CGC TCT CAT 4578
Leu Thr Thr Glu Trp Gln Cys Gln Val Gln Lys Ile Thr Arg Ser His
1400 1405 1410

AGC ACA GAT ATT CCT TAC ATT GTG TCG GAA GCT GGA GTG CAA GCT GAG 4626
Ser Thr Asp Ile Pro Tyr Ile Val Ser Glu Ala Ala Val Gln Ala Glu
1415 1420 1425

GAA AAA GAG CAG TTT GCA GAT ATG CAA GAT GAA GAC GAT GTC GCT GAA 4674
Gln Lys Glu Gln Phe Ala Asp Met Gln Asp Glu His His Val Ala Glu
1430 1435 1440

GCA ATT CCT CCA ATC CCT CGC TTG TCC CTA ACC ATT ACT GAC AGA AAT 4722
Ala Ile Pro Arg Ile Pro Arg Leu Ser Leu Thr Ile Thr Asp Arg Asn
1445 1450 1455

GGG ATG GAA AAC TTA CTG TCT GTG AAG CCA GAT CAA ACT TTG GGA TTC 4770
Gly Met Glu Asn Leu Leu Ser Val Lys Pro Asp Gln Thr Leu Gly Phe
1460 1465 1470 1475

CCA TCT CTC AGG TCA AAA AGT TTA CAT GGA CAT CCT AGG AAT GTG AAA 4818
Pro Ser Leu Arg Ser Lys Ser Leu His Gly His Pro Arg Asn Val Lys
1480 1485 1490

TCC ATT CAG GGA AAG TTA GAC AGA TCT GGA CAT GCC AGT AGT GTA AGC 4866
Ser Ile Gln Gly Lys Leu Asp Arg Ser Gly His Ala Ser Ser Val Ser
1495 1500 1505

AGC TTA GTA ATT GTG TCT GCA ATG ACA GCA GAA GAA AAA AAG GTT AAG 4914
Ser Leu Val Ile Val Ser Gly Met Thr Ala Glu Glu Lys Lys Val Lys
1510 1515 1520

AAA GAG AAA GCT TCC ACA GAA ACT GAA TGC T AGTCTGTTTT GTTTCTTTAA 4965
Lys Glu Lys Ala Ser Thr Glu Thr Glu Cys
1525 1530

T TT TT TTTTT TAACAGTCAG AAACCCACTA ATGGGTGTCA TCTTGGCCCA TCCTAAACAC 5025 

ATMTCCAATT TCCTAAAAAC ATTTTCCCTT 5055 
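
The pairing of each codon row with its amino acid row in the listing above can be checked mechanically. The following is a minimal sketch in Python, not part of the original disclosure, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are hypothetical helpers introduced only for illustration. It translates the final coding codons of the listing up to the stop codon.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
# This table is an assumption of the sketch; the patent does not specify it.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) until the first stop codon.

    Codons containing characters other than A, C, G, T (for example OCR
    damage) are reported as 'X' rather than guessed.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[i:i + 3], "X")
        if residue == "*":  # stop codon ends the open reading frame
            break
        protein.append(residue)
    return "".join(protein)


# Last coding codons of the listing, followed by the TAG stop codon.
tail = "AAA GAG AAA GCT TCC ACA GAA ACT GAA TGC TAG"
print(translate(tail))  # KEKASTETEC
```

The result corresponds to the final residues Lys Glu Lys Ala Ser Thr Glu Thr Glu Cys (positions 1524 to 1533). Rows where the translation disagrees with the printed residue are candidates for character-recognition errors in one of the two rows.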



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1533 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
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Met Tyr Ile Arg Val Ser Tyr Asp Thr Lys Pro Asp Ser Leu Leu His

1 5 10 15

Leu Met Val Lys Asp Trp Gln Leu Glu Leu Pro Lys Leu Leu Ile Ser
20 25 30

Val His Gly Gly Leu Gln Asn Phe Glu Met Gln Pro Lys Leu Lys Gln
35 40 45

Val Phe Gly Lys Gly Leu Ile Lys Ala Ala Met Thr Thr Gly Ala Trp
50 55 60

Ile Phe Thr Gly Gly Val Ser Thr Gly Val Ile Ser His Val Gly Asp
65 70 75 80

Ala Leu Lys Asp His Ser Ser Lys Ser Arg Gly Arg Val Cys Ala Ile
85 90 95

Gly Ile Ala Pro Trp Gly Ile Val Glu Asn Lys Glu Asp Leu Val Gly
100 105 110

Lys Asp Val Thr Arg Val Tyr Gln Thr Met Ser Asn Pro Leu Ser Lys
115 120 125

Leu Ser Val Leu Asn Asn Ser His Thr His Phe Ile Leu Ala Asp Asn
130 135 140

Gly Thr Leu Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Leu Leu
145 150 155 160

Glu Lys His Ile Ser Leu Gln Lys Ile Asn Thr Arg Leu Gly Gln Gly
165 170 175

Val Pro Leu Val Gly Leu Val Val Glu Gly Gly Pro Asn Val Val Ser
180 185 190

Ile Val Leu Glu Tyr Leu Gln Glu Glu Pro Pro Ile Pro Val Val Ile
195 200 205

Cys Asp Gly Ser Gly Arg Ala Ser Asp Ile Leu Ser Phe Ala His Lys
210 215 220

Tyr Cys Glu Glu Gly Gly Ile Ile Asn Glu Ser Leu Arg Glu Gln Leu
225 230 235 240

Leu Val Thr Ile Gln Lys Thr Phe Asn Tyr Asn Lys Ala Gln Ser His
245 250 255

Gln Leu Phe Ala Ile Ile Met Glu Cys Met Lys Lys Lys Glu Leu Val
260 265 270

Thr Val Phe Arg Met Gly Ser Glu Gly Gln Gln Asp Ile Glu Met Ala
275 280 285

Ile Leu Thr Ala Leu Leu Lys Gly Thr Asn Val Ser Ala Pro Asp Gln
290 295 300

Leu Ser Leu Ala Leu Ala Trp Asn Arg Val Asp Ile Ala Arg Ser Gln
305 310 315 320

Ile Phe Val Phe Gly Pro His Trp Thr Pro Leu Gly Ser Leu Ala Pro
325 330 335
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Pro Thr Asp Ser Lys Ala Thr Glu Lys Glu Lys Lys Pro Pro Met Ala
340 345 350 

Thr Thr Lys Gly Gly Arg Gly Lys Gly Lys Gly Lys Lys Lys Gly Lys
355 360 365

Val Lys Glu Glu Val Glu Glu Glu Thr Asp Pro Arg Lys Ile Glu Leu
370 375 380

Leu Asn Trp Val Asn Ala Leu Glu Gln Ala Met Leu Asp Ala Leu Val
385 390 395 400

Leu Asp Arg Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val Asn
405 410 415

Met Gln His Phe Leu Thr Ile Pro Arg Leu Glu Glu Leu Tyr Asn Thr
420 425 430

Arg Leu Gly Pro Pro Asn Thr Leu His Leu Leu Val Arg Asp Val Lys
435 440 445

Lys Ser Asn Leu Pro Pro Asp Tyr His Ile Ser Leu Ile Asp Ile Gly
450 455 460

Leu Val Leu Glu Tyr Leu Met Gly Gly Ala Tyr Arg Cys Asn Tyr Thr
465 470 475 480

Arg Lys Asn Phe Arg Thr Leu Tyr Asn Asn Leu Phe Gly Pro Lys Arg
485 490 495

Pro Lys Ala Leu Lys Leu Leu Gly Met Glu Asp Asp Glu Pro Pro Ala
500 505 510

Lys Gly Lys Lys Lys Lys Lys Lys Lys Lys Glu Glu Glu Ile Asp Ile
515 520 525

Asp Val Asp Asp Pro Ala Val Ser Arg Phe Gln Tyr Pro Phe His Glu
530 535 540

Leu Met Val Trp Ala Val Leu Met Lys Arg Gln Lys Met Ala Val Phe
545 550 555 560

Leu Trp Gln Arg Gly Glu Glu Ser Met Ala Lys Ala Leu Val Ala Cys
565 570 575

Lys Leu Tyr Lys Ala Met Ala His Glu Ser Ser Glu Ser Asp Leu Val
580 585 590

Asp Asp Ile Ser Gln Asp Leu Asp Asn Asn Ser Lys Asp Phe Gly Gln
595 600 605

Leu Ala Leu Glu Leu Leu Asp Gln Ser Tyr Lys His Asp Glu Gln Ile
610 615 620

Ala Met Lys Leu Leu Thr Tyr Glu Leu Lys Asn Trp Ser Asn Ser Thr
625 630 635 640

Cys Leu Lys Leu Ala Val Ala Ala Lys His Arg Asp Phe Ile Ala His
645 650 655

Thr Cys Ser Gln Met Leu Leu Thr Asp Met Trp Met Gly Arg Leu Arg
660 665 670
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Met Arg Lys Asn Pro Gly Leu Lys Val Ile Met Gly Ile Leu Leu Pro
675 680 685 

Pro Thr Ile Leu Phe Leu Glu Phe Arg Thr Tyr Asp Asp Phe Ser Tyr
690 695 700

Gln Thr Ser Lys Glu Asn Glu Asp Gly Lys Glu Lys Glu Glu Glu Asn
705 710 715 720

Thr Asp Ala Asn Ala Asp Ala Gly Ser Arg Lys Gly Asp Glu Glu Asn
725 730 735

Glu His Lys Lys Gln Arg Ser Ile Pro Ile Gly Thr Lys Ile Cys Glu
740 745 750

Phe Tyr Asn Ala Pro Ile Val Lys Phe Trp Phe Tyr Thr Ile Ser Tyr
755 760 765

Leu Gly Tyr Leu Leu Leu Phe Asn Tyr Val Ile Leu Val Arg Met Asp
770 775 780

Gly Trp Pro Ser Leu Gln Glu Trp Ile Val Ile Ser Tyr Ile Val Ser
785 790 795 800

Leu Ala Leu Glu Lys Ile Arg Glu Ile Leu Met Ser Glu Pro Gly Lys
805 810 815

Leu Ser Gln Lys Ile Lys Val Trp Leu Gln Glu Tyr Trp Asn Ile Thr
820 825 830

Asp Leu Val Ala Ile Ser Thr Phe Met Ile Gly Ala Ile Leu Arg Leu
835 840 845

Gln Asn Gln Pro Tyr Met Gly Tyr Gly Arg Val Ile Tyr Cys Val Asp
850 855 860

Ile Ile Phe Trp Tyr Ile Arg Val Leu Asp Ile Phe Gly Val Asn Lys
865 870 875 880

Tyr Leu Gly Pro Tyr Val Met Met Ile Gly Lys Met Met Ile Asp Met
885 890 895

Leu Tyr Phe Val Val Ile Met Leu Val Val Leu Met Ser Phe Gly Val
900 905 910

Ala Arg Gln Ala Ile Leu His Pro Glu Glu Lys Pro Ser Trp Lys Leu
915 920 925

Ala Arg Asn Ile Phe Tyr Met Pro Tyr Trp Met Ile Tyr Gly Glu Val
930 935 940

Phe Ala Asp Gln Ile Asp Leu Tyr Ala Met Glu Ile Asn Pro Pro Cys
945 950 955 960

Gly Glu Asn Leu Tyr Asp Glu Glu Gly Lys Arg Leu Pro Pro Cys Ile
965 970 975

Pro Gly Ala Trp Leu Thr Pro Ala Leu Met Ala Cys Tyr Leu Leu Val
980 985 990

Ala Asn Ile Leu Leu Val Asn Leu Leu Ile Ala Val Phe Asn Asn Thr
995 1000 1005
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Phe Phe Glu Val Lys Ser Ile Ser Asn Gln Val Trp Lys Phe Gln Arg
1010 1015 1020 

Tyr Gln Leu Ile Met Thr Phe His Asp Arg Pro Val Leu Pro Pro Pro
1025 1030 1035 1040

Met Ile Ile Leu Ser His Ile Tyr Ile Ile Ile Met Arg Leu Ser Gly
1045 1050 1055

Arg Cys Arg Lys Lys Arg Glu Gly Asp Gln Glu Glu Arg Asp Arg Gly
1060 1065 1070

Leu Lys Leu Phe Leu Ser Asp Glu Glu Leu Lys Arg Leu His Glu Phe
1075 1080 1085

Glu Glu Gln Cys Val Gln Glu His Phe Arg Glu Lys Glu Asp Glu Gln
1090 1095 1100

Gln Ser Ser Ser Asp Glu Arg Ile Arg Val Thr Ser Glu Arg Val Glu
1105 1110 1115 1120

Asn Met Ser Met Arg Leu Glu Glu Ile Asn Glu Arg Glu Thr Phe Met
1125 1130 1135

Lys Thr Ser Leu Gln Thr Val Asp Leu Arg Leu Ala Gln Leu Glu Glu
1140 1145 1150

Leu Ser Asn Arg Met Val Asn Ala Leu Glu Asn Leu Ala Gly Ile Asp
1155 1160 1165

Arg Ser Asp Leu Ile Gln Ala Arg Ser Arg Ala Ser Ser Glu Cys Glu
1170 1175 1180

Ala Thr Tyr Leu Leu Arg Gln Ser Ser Ile Asn Ser Ala Asp Gly Tyr
1185 1190 1195 1200

Ser Leu Tyr Arg Tyr His Phe Asn Gly Glu Glu Leu Leu Phe Glu Asp
1205 1210 1215

Thr Ser Leu Ser Thr Ser Pro Gly Thr Gly Val Arg Lys Lys Thr Cys
1220 1225 1230

Ser Phe Arg Ile Lys Glu Glu Lys Asp Val Lys Thr His Leu Val Pro
1235 1240 1245

Glu Cys Gln Asn Ser Leu His Leu Ser Leu Gly Thr Ser Thr Ser Ala
1250 1255 1260

Thr Pro Asp Gly Ser His Leu Ala Val Asp Asp Leu Lys Asn Ala Glu
1265 1270 1275 1280

Glu Ser Lys Leu Gly Pro Asp Ile Gly Ile Ser Lys Glu Asp Asp Glu
1285 1290 1295

Arg Gln Thr Asp Ser Lys Lys Glu Glu Thr Ile Ser Pro Ser Leu Asn
1300 1305 1310

Lys Thr Asp Val Ile His Gly Gln Asp Lys Ser Asp Val Gln Asn Thr
1315 1320 1325

Gln Leu Thr Val Glu Thr Thr Asn Ile Glu Gly Thr Ile Ser Tyr Pro
1330 1335 1340
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Leu Glu Glu Thr Lys Ile Thr Arg Tyr Phe Pro Asp Glu Thr Ile Asn
1345 1350 1355 1360 

Ala Cys Lys Thr Met Lys Ser Arg Ser Phe Val Tyr Ser Arg Gly Arg
1365 1370 1375

Lys Leu Val Gly Gly Val Asn Gln Asp Val Glu Tyr Ser Ser Ile Thr
1380 1385 1390

Asp Gln Gln Leu Thr Thr Glu Trp Gln Cys Gln Val Gln Lys Ile Thr
1395 1400 1405

Arg Ser His Ser Thr Asp Ile Pro Tyr Ile Val Ser Glu Ala Ala Val
1410 1415 1420

Gln Ala Glu Gln Lys Glu Gln Phe Ala Asp Met Gln Asp Glu His His
1425 1430 1435 1440

Val Ala Glu Ala Ile Pro Arg Ile Pro Arg Leu Ser Leu Thr Ile Thr
1445 1450 1455

Asp Arg Asn Gly Met Glu Asn Leu Leu Ser Val Lys Pro Asp Gln Thr
1460 1465 1470

Leu Gly Phe Pro Ser Leu Arg Ser Lys Ser Leu His Gly His Pro Arg
1475 1480 1485

Asn Val Lys Ser Ile Gln Gly Lys Leu Asp Arg Ser Gly His Ala Ser
1490 1495 1500

Ser Val Ser Ser Leu Val Ile Val Ser Gly Met Thr Ala Glu Glu Lys
1505 1510 1515 1520

Lys Val Lys Lys Glu Lys Ala Ser Thr Glu Thr Glu Cys
1525 1530
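
The three-letter listing of SEQ ID NO: 9 and the one-letter rows printed in FIG. 6 describe the same protein in two notations. The following is a minimal sketch in Python, not part of the original disclosure, of converting a listing row to one-letter code so the two can be compared; `THREE_TO_ONE` and `to_one_letter` are hypothetical names, and the mapping is the conventional set of codes for the twenty standard amino acids.

```python
# Conventional three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(row: str) -> str:
    """Convert one listing row of three-letter codes to one-letter code.

    Position numbers are skipped. Tokens that are not valid residue codes
    (for example scanning errors such as 'Lvs' or 'Gin') become '?', so
    damaged rows are flagged instead of silently repaired.
    """
    letters = []
    for token in row.split():
        if token.isdigit():
            continue  # residue position label, not a residue
        letters.append(THREE_TO_ONE.get(token.capitalize(), "?"))
    return "".join(letters)


last_row = "Lys Val Lys Lys Glu Lys Ala Ser Thr Glu Thr Glu Cys"
print(to_one_letter(last_row))  # KVKKEKASTETEC
```

Counting the characters returned over all rows gives a quick check against the stated length of 1533 amino acids, and any '?' in the output points at a row that needs to be compared with the figure.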
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WHAT IS CLAIMED IS: 

1. An isolated nucleic acid comprising the
nucleotide sequence SEQ ID NO: 2 as depicted in FIG. 3, or
the nucleotide sequence of a gene or gene fragment
contained in the following clone as deposited with the
NRRL: pFOMY030 (NRRL accession No. B-21416).

2. An isolated nucleic acid which encodes the
amino acid sequence depicted in Fig. 3 from amino acid
residue number 1 to 542, or its complement.

3. An isolated nucleic acid which hybridizes to
an isolated nucleic acid which encodes the amino acid
sequence depicted in Fig. 3 from amino acid residue
number 1 to 542, or its complement.

4. A genetically engineered host cell comprising
the nucleotide sequence of Claim 1, 2, or 3.

5. An expression vector comprising the nucleotide
sequence of Claim 1, 2, or 3 in operative association
with a nucleotide sequence regulatory element that
controls expression of the nucleotide sequence in a host
cell.

6. A substantially pure gene product encoded by
the nucleic acid of Claim 1, 2, or 3.

7. An isolated nucleic acid comprising the
nucleotide sequence SEQ ID NO: 6 as depicted in FIG. 5, or
SEQ ID NO: 8 as depicted in FIG. 6.
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8. An isolated nucleic acid which encodes the
amino acid sequence depicted in Fig. 5 from amino acid 
residue number 1 to 1497, or its complement. 

9. An isolated nucleic acid which encodes the
amino acid sequence depicted in Fig. 6 from amino acid
residue number 1 to 1533, or its complement.

10. An isolated nucleic acid which hybridizes to
a nucleic acid that encodes the amino acid sequence
depicted in Fig. 5 from amino acid residue number 1 to
1497 (SEQ ID NO: 7), or its complement, or the amino acid
sequence depicted in Fig. 6 from amino acid residue
number 1 to 1533 (SEQ ID NO: 9), or its complement.

11. A genetically engineered host cell containing
the nucleotide sequence of any one of Claims 7 to 10.

12. An expression vector containing the
nucleotide sequence of any one of Claims 7 to 10 in
operative association with a nucleotide sequence
regulatory element that controls expression of the
nucleotide sequence in a host cell.

13. A substantially pure gene product encoded by
the nucleic acid of any one of Claims 7 to 10.

14. An antibody that immunospecifically binds the
gene product of Claim 6 or 13.

15. A composition for detecting expression of a
gene transcript or gene product of gene 030 for use in a
method of diagnosing tumor progression in a mammal,
wherein a level of expression of the gene transcript or
gene product lower in a test sample than in a control




sample indicates a tumor progression state in the test
sample.

16. A compound for use in a method of treating
tumor progression in a mammal, wherein said compound
increases the level of expression or activity of a gene
transcript or gene product of gene 030 in cells
exhibiting a tumor progression state.

17. The use of a compound for the manufacture of
a medicament for use in a method of treating tumor
progression in a mammal, wherein said compound increases
the level of expression or activity of a gene transcript
or gene product of gene 030 in cells exhibiting a tumor
progression state.

18. A compound of Claim 16 or 17, wherein the
compound comprises a nucleic acid whose administration
results in an increase in the level of the differentially
expressed gene transcript and gene product in the cells
in the tumor progression state.

19. The compound of Claim 18, wherein the nucleic
acid comprises the nucleic acid of any one of Claims 1 to
3, or 7 to 10.

20. A compound of Claim 16 or 17, wherein the
compound is a 030 gene product.

21. A compound comprising a normal allele of a
030 gene for use in a method for inhibiting tumor
progression in a mammal, wherein the normal allele
expresses a normal gene product thereby inhibiting a
tumor progression state.
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22. A compound of Claim 16, 17 , or 21, wherein 
the tumor progression state is neoplasia. 

23. A compound of Claim 16, 17, or 21, wherein
the tumor progression state is metastasis.

24. A composition for detecting expression of a
gene transcript or gene product of gene 030 for use in a
method of monitoring the efficacy of a compound in
clinical trials for inhibition of tumor progression,
wherein a level of expression of the gene transcript or
gene product lower in a test sample treated with said
compound than in a control sample indicates that the
compound is effective to inhibit tumor progression.
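
Several of the claims above recite a nucleic acid "or its complement". The following is a minimal sketch in Python, not part of the claims, of how the complementary strand of a DNA sequence can be derived, assuming standard Watson-Crick base pairing and reading the result 5' to 3'; `reverse_complement` is a hypothetical helper name, and the example input is the first three codons of the coding sequence shown in FIG. 5.

```python
# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'.

    Whitespace between codons is ignored. Characters other than
    A, C, G, T are passed through unchanged.
    """
    seq = "".join(dna.split())
    return seq.translate(COMPLEMENT)[::-1]


# First three codons of the open reading frame (Met Tyr Ile).
print(reverse_complement("ATG TAT ATC"))  # GATATACAT
```

Applying the function twice returns the original sequence, which is a convenient sanity check when preparing probe or primer sequences from a listing.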
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CTC CCC TCA CACCCAGTGCAGGGCCTCCTAGC IU'ICACATCCAGCCATTCACATCGCCTtrrCA^jGACTGGGCCAGGC 2018 

L p * (SEQ ID NO: 3) 

ACTCCAACCTCTCAAGCATGll. 1 LL'1VLLL W 1XJLUMLLU' A CAATAGCCCCCCCT C TG GGC CCCATCC L '' 1 ' L ' I\JU ' 1 1 L lUI ' C 2098 

ACCCCGTTCTCCTCCCXACICATCACTGGCCC'HICTCVIU IX.l - IUC AAOGCAAGaA^CAJ^W^AGC AIt- nT I TCCC C 2178 

ACAAAACrrrrAGGGCTCCCCGLlUn 2258 

CAArcC TTCCCCACCCTATCCCAATAGACCCTAT^ 2338 

TTCTTCCTCCACCCTCTTAAACCTCCAATAAATCTCAC^ t LACTCTCACTCCACALr'l "1'IUIAC 2418 

ACACACC CACAC C C T CG T' llUUTl t-T ACTTACAAOGAAGG L ITllWIXriVlVIin C L ri^LL- ^ACTCA Criirilr ' nU AC 2498 

AGAAGCACTTTCCACATX^AAAGCGTTCACTCACCTGCA'ltj 1 llATTAATTAATACTCATACAAA^TATTGACACTTCT 2 57 8 

it^TTCT ri^ n'rATGC AOCCGAAAGCACTTAAC C 1 ^ 11/HjGC ACTTT 2658 

ATTTACTATATATAC C n - lUTC A7T C TOT\ X AAGCAAAAATT^A A^^ 27 30 

(SEQ ID NO: 2) 
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ACTCATrATAGCC3m:C^ 

AATCTTGGATAGAGAAAACCTTTTGCJJ*^^ 

L ' lWXJ GCCASTTCJ^CAACCaOCATATC 

GTGGAGACIX^^CTGAGAAAT ^ 

CCAGGGTGGCGGATATTCCAATAAAGCC 

M Y I R V S Y D T K P D S L L H L M V K 20

ATG TAT ATC CGT GTA TCC TAT GAC ACC AAG CCA GAC TCA CTG CTC CAT CTC ATG GTG AAA 60 

D W Q L E L P K L L I S V H G G L Q N F 40

GAT TGG CAG CTG GAA CTC CCC AAG CTC TTA ATA TCT GTG CAT GGA GGC CTC CAG AAC TTT 120 

E M Q P K L K Q V F G K G L I K A A M T 60

GAG ATG CAG CCC AAG CTG AAA CAA CTC TTT GGG AAA GGC CTG ATC AAG GCT GCT ATG ACC 180 

TGAWI FTGGVSTGVISHVGD 80 

ACC GGG GCC TGG ATC TTC ACC GGG GGT GTC AGC ACA GGT GTT ATC AGC CAC GTA GGG GAT 240 

A L K D H S S K S R G R V C A I G I A P 100

GCC TTG AAA GAC CAC TCC TCC AAG TCC AGA GGC CGG GTT TGT GCT ATA GGA ATT GCT CCA 300 

W G I V E N K E D L V G K D V T R V Y Q 120

TGG GGC ATC GTG GAG AAT AAG GAA GAC CTG GTT GGA AAG GAT GTA ACA AGA GTG TAC CAG 360 

T M S N P L S K L S V L N N S H T H F I 140

ACC ATC TCC AAC CCT CTA AGT AAG CTC TCT GTG CTC AAC AAC TCC CAC ACC CAC TTC ATC 420 

L A DNG T LG KYG A E V K L R R L L 160 

CTG GCT GAC AAT GGC ACC CTG GGC AAG TAT GGC GCC GAG GTG AAG CTG CGA AGG CTG CTG 480 

E K H I S L Q K I N T R L G Q G V P L V 180

GAA AAG CAC ATC TCC CTC CAG AAG ATC AAC ACA AGA CTG GGG CAG GGC GTG CCC CTC GTG 540 

GLVVEGGPNVVSIVLEYLQE 200

GGT CTC GTG GTG GAG GGG GGC CCT AAC GTG GTG TCC ATC GTC TTG GAA TAC CTG CAA GAA 600 

EPPIPVV ICDGSGRASDILS 220 

GAG CCT CCC ATC CCT GTG GTG ATT TGT GAT GGC AGC GGA CGT GCC TCG GAC ATC CTG TCC 660 

" A H K Y C E E Z G ; Z V E 3 L ?. E ; - 240 

TTT GCG CAC AAG TAC TGT GAA GAA GGC GGA ATA ATA AAT GAG TCC CTC AGG GAG CAG CTT 720 

L V T I Q K T F N Y N K A Q S H Q L F A 260

CTA GTT ACC ATT CAG AAA ACA TTT AAT TAT AAT AAG GGA CAA TCA CAT CAG CTG TTT GCA 780 

I IMECMKKKELVTVFRMGS E 280 

ATT ATA ATG GAG TCC ATG AAG AAG AAA GAA CTC GTC ACT GTG TTC AGA ATG GGT TCT GAG 840 

GQQDIEMAILTALLKGTNVS 300

CMC CAG CAG GAC ATC GAG ATG GCA ATT TTA ACT GCC CTG CTG AAA GGA ACA AAC GTA TCT 900 

A P D Q L S L A L A W N R V D I A R S Q 320

GCT CCA GAT CAG CTG AGC TTG GCA CTG GCT TGG AAC CGC GTG GAC ATA GCA CGA AGC CAG 960 

IFVFGPHWTPLGSLAPPTDS 340

ATC TTT GTC TTT GGG CCC CAC TGG ACG CCC CTG GGA AGC CTG GCA CCC CCG ACG GAC AGC 1020 

KATEKEKK PPMATTKGGRG K 360 

AAA GCC ACG GAG AAG GAG AAG AAG CCA CCC ATG GCC ACC ACC AAG GGA GGA AGA GGA AAA 1080 

GKGKKKGKVKEEVEEETDPR 380

GGG AAA GGC AAG AAG AAA GGG AAA GTG AAA GAG GAA GTG GAG GAA GAA ACT GAC CCC CGG 1140 

K I E L L N W V N A L E Q A M L D A L V 400
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" G •* L. S v 
CCA GGC AAA CTC AGC CAG AAA 
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T CAG GAG TAC TGG AAC ATC ACA GAT CTC GTG GCC ATT TCC ACA TTC 

A T /. . - - -» * ■ • - * • - - / • * 

ATG ATT GGA GCA ATG GCC ACG ASA TCT GTG ATG ATG ATT GGA AAG ATG ATG ATC GAC ATG 

L V " V V : M L V V L M 2 F 

era tac ttt gtg arc atc atg cts gtc gtg ctc atg a™ ttc 



A F. 0 A 
A GTA Ore CGT CAA GCC 



H ? 
CAT CEP 



AG AAG CTC 



.*« — » .-\ r\ 
TGG AAA CTG GCC CG? 



"MP 
TAC ATG CCC 



V v; :•: r v z z v t a d q : d - a m - x 

TAC TGG ATG ATC TAT GGA GAG GTG TTT GCA GAC CAG ATA GAC CTC TAC GCC ATG GAA ATT 



:; ? ? C G £ 13 1 V D E - ^ - " 
AAT CCT CCT T3T GGT GAG AAC CTA TAT GAT GAG GAG GGC AAG CGG CT^ 



TGG 



? ? Z I 

:cr ccc tct atc 

T ? ALMA C V - - V A ! I I L 
:C ACT CCA GCA CTC ATG GCG TGC TAT CTA CTG GTC GCC AAC ATC CTG 



GTG 



:i c v w ?: r o ?. v ? l : m t f h d r ? v 

AAC CAG GTG TGG AAG TTC CAG CGA TAT CAG CTG ATT ATG ACA TTT CAT GAC AGG CCA GTC 



lpp?m:tlsh:y: : : m r l s g 

CTC CCC CCA CCG ATG ATC ATT TTA AGC CAC ATC TAC ATC ATC ATT ATG CGT CTC AGC GGC 



r c r k :-: 

CGC TGC AGG AAA AAC 



REGDQE'e?. D R G L K I- F 
AGA GAA GGG GAC CAA GAG GAA CGG GAT CGT GGA TTG AAG - -C TTC 



LSDEELKF. LHEFSE3CVQEH 
CTT AGC GAC GAG GAG CTA AAG AGG CTG CAT GAG TTC GAG GAG CAG TGC GTG CAG GAG CAC 

TTC CGG GAG AAG GAG GAT GAG CAG CAG TCG TCC AGC GAC GAG CGC ATC CGG GTC ACT TCT 

r r l e z : e r e t r k 

GAA AGA GTT GAA AAT ATG TCA ATG AGG TTG GAA GAA ATC AAT GAA AGA GAA ACT TTT ATG 

KTSLQTYDLRLA?LEELSUR 
AAA ACT TCC CTG CAG ACT GTT GAC CTT CGA CTT GCT CAG CTA GAA GAA TTA TCT AAC AGA 

MVN A L EN LAG I D r. SDL IQ AR 
ATG GTG AAT GCT CTT GAA AAT CTT GCG GGA ATC GAC AGG TCT GAC CTG ATC CAG GCA CGG 



SF. A -SSECEATYLLR, QSSINS 
TCC CGG GCT TCT TCT GAA TGT GAG GCA AGG TAT CTT CTC CGG CAA AGC AGC ATC AAT AGC 



ADGYSLYRVHFHGEELLFED 
GCT GAT GGC TAC AGC TTG TAT CGA TAT CAT TTT AAC GGA GAA GAG TTA TTA TTT GAG GAT 

TS^STSPGTGVRKKTCSFRI 
ACA TCT CTC TCC ACG TCA CCA GGG ACA GGA GTC AGG AAA AAA ACC TGT TCC TTC CGT ATA 

K E E K 3 V K T H X- V P E C Q H S L H L 
AAG GAA GAG AAG GAC GTG AAA ACG CAC CTA GTC CCA GAA TGT CAG AAC ACT CTT CAC CTT 

3 L GTSTSATPCGSHLAVDDL 
TCA CTG GGC ACA AGC ACA TCA GCA ACC CCA GAT GGC AGT CAC CTT GCA GTA GAT GAC TTA 
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* " A E E 3 K PD I G I S K ED DE 1260 

AAG AAC GCT GAA GAG TCA AAA TTA GGT CCA GAT ATT GGG ATT TCA AAG GAA GAT GAT GAA 3780 

B *? TDSK KEETISPSLNKTDV 1280 

AGA CAG ACA GAC TCT AAA AAA GAA GAA ACT ATT TCC CCA ACT TTA AAT AAA ACA GAT GTG 3840 

JL „" ° Q ° K S DVQNTQLTVETTN 1300 

ATA Cr»T GGA CAG GAC AAA TCA GAT GTT CAA AAC ACT CAG CTA ACA GTG GAA ACG ACA AAT 3900 

J, EGTI SYPLEETKITRVFPD 1320 

ATA GAA GGC ACT ATT TCC TAT CCC CTG GAA GAA ACC AAA ATT ACA CGC TAT TTC CCC GAT 3960 

ETINA CK TMKSR3FVYSRGR "40 

GAA ACG ATC AAT GCT TGT AAA ACA ATG AAG TCC AGA AGC TTC GTC TAT TCC CGG GGA AGA 4020 

*JL VGGVN ODVEYSSITDQQL 1360 

AAG CTG GTC GGT GGG GTT AAC CAG GAT GTA GAG TAC AGT TCA ATC ACG GAC CAG CAA TTG 4080 

»~ EW ^ C OVQKITRSHSTD IP 1380 

ACG ACG GAA TOG CAA TGC CAA GTT CAA AAG ATC ACG CGC TCT CAT AGC ACA GAT ATT CCT 4140 

»„ »L, JL . JL A A A V 0 A E Q K E Q F A D M Q 1400 



4200 



ATT GTG' TCG GAA GCT GCA GTG CAA GCT GAG CAA AAA GAG CAG TTT GCA GAT ATG CAA 

°^f" HVASAI p R I PRLS LT IT 1420 

GAT GAA CAC CAT GTC GCT GAA GCA ATT CCT CGA ATC CCT CGC TTG TCC CTA ACC ATT ACT 4260 

D RMGMSNLLSVKPDQTLGFP 1440 

GAC AGA AAT GGG ATG GAA AAC TTA CTG TCT GTG AAG CCA GAT CAA ACT TTG GGA TTC CCA 4320 

JL JL JL JL K 3 L H G H ? R N V K S I Q G K 1460 

TCT CTC AGG *CA AAA AGT TTA CAT GGA CAT CCT AGG AAT- GTG AAA TCC ATT CAG GGA AAG 4380 

JLJL RSGHASSV SSLVIVSGMT 14B0 

GAC AGA TCT GGA CAT GCC AGT AGT GTA AGC AGC TTA GTA ATT GTG TCT GGA ATG ACA 4440 

(i»rLrL». K V K K E K A S T E T E C (SEQ 10 NO: 7) 1497 

GAA GAA AAA AAG GTT AAG AAA GAG AAA GCT TCC ACA GAA ACT GAA TGC TAG TCTGTT 4991 

TTGTTTCrrTTAATTTTTTTTT^ 
TTCCTAAAAACATTTTCCCT (SEQ ID NOl 6) 
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AATCTTGGJVTAGAGAAAACCTTTTGCAAACGGGAATGT^ 
CTCTGGCCACTIX^CCAACC^^ATATC^^ 
CTGGAGACTCAGCCTGAGAAATGGTCT^^ 
COVGGGTOGCGGATATTCCAATAAAGCC 

M v ; » v = V D T K P D 3 L L H L M V 20 
ATG TAT A" CGT GTA TCC TAT GAC ACC AAG CCA GAC TCA CTG CTC CAT CTC ATG GTG AAA 60 

D w ; - E Z ? X L L I s v h " g g l g :z jr^ . t° 

GAT TGG CAG CTG GAA CTC CCC AAG CTC TTA ATA TCT GTG CAT GGA GGC . CAC AAC 120 



E M ; ? K - K Q V F G K 



GAG ATG 



T G A W I 

ACC GGG C-CC TGG ATC TTC ACC GGG GOT GTC AGC ACA GOT GTT ATC AGC CAC GTA GGG GA. 

A L Y. Z H 3 3 :*. S R G R V C A " G I A ?^ 

GCC TTG AAA GAC CAC TCC TCC AAG TCC AGA GGC CGG GTT TCT GCT ATA GGA A.* GCT CCA 

v: z : v z :: :-' z z z v g :-: - 7 ^ v _ ■ 

TGG GGC ATC GTG GAG AAT AAG GAA GAC CTC GTT GGA AAG GAT GTA ACA AGA GTG TAC CAC 



TTT GCG CAC AAG TAC TCT GAA GAA GGC GGA ATA ATA AAT GAG TCC CTC AGG GAw 



I I M E C M K 



ATT ATA ATG GAG TGC ATG AAG AAG AAA GAA CTC GTC ACT GTG TTC AGA ATG GGT TCT GAG 



I T V F G ? H V; 
ATC TTT GTC TTT GGG CCC CAC TGG ACG CCC CTG GGA AGC CTG GCA CCC CCG ACG GAC AGC 



60 



CCC AAG CTG AAA CAA GTC TTT GGG AAA GGC CTG ATC AAG GCT GCT ATG ACC 180 

- G G V S T G V I S K V G C BO 

240 

100 

300 



360 



T M 3 W ? Z 3 K Z S V Z « :i 3 H T H ~ I 140 

ACC ATG TCC AAC CCT CTA AGT AAG CTC TCT GTC CTC AAC AAC TCC CAC ACC CAC TTC ATC 420 

LADNGT-GKYGAEVKLRR-L 160 

CTG GCT GAC AAT GGC ACC CTG GGC AAG TAT GGC GCC GAG GTG AAG CTG CGA AGG CTG CTC 480 

ekk:s-qk:ntrlgqgvplv ibo 

GfiA AAG CAC ATC TCC CTC CAG AAG ATC AAC ACA AGA CTG GGG CAG GGC GTG CCC CTC GTG 540 

G L V V E G G P N V V S 1 V L E V Z Q E 200 

GGT CTC GTG GTG GAG GGG GGC CCT AAC GTG GTG TCC ATC GTC TTG GAA TAC CTG CAA GAA 600 

£ p p ; p •/ V I C D G S G R A S Z- I - 3 220 

GAG CCT CCC ATC CCT GTG GTG ATT TGT GAT GGC AGC GGA CCT GCC TCG GAC ATC CTG TCC 660 



:40 



260 



L V T 1 Q -\ T r K V W K A Q 3 H v 
CTA GTT ACC ATT CAG AAA ACA TTT AAT TAT AAT AAG GCA CAA TCA CAT CAG CTG TTT GCA 780 



KKELVTVFRMG3E 280 



B40 



GQQDIEMAILTALL-KGT -NV S 300 
GGC CAC CAG GAC ATC GAG ATG GCA ATT TTA ACT GCC CTG CTG AAA GGA ACA AAC GTA TCT 900 

APDQLSLALAWNRVDIARSC 320 
GCT CCA GAT CAG CTG AGC TTG GCA CTG GCT TGG AAC CGC GTG GAC ATA GCA CGA AGC CAG 960 

PLGS-APPTDS 340 

1020 



K A T E K E K K P P M A T T K G G R 3 K 360 

AAA GCC ACG GAG AAG GAG AAG AAG CCA CCC ATG GCC ACC ACC AAG GGA GGA AGA GGA. AAA 1080 

GKGKKKGKVKESVEEETOF ?. 380 

GGG AAA GGC AAG AAG AAA GGG AAA GTG AAA GAG GAA GTG GAG GAA GAA ACT GAC CCC CSC 1140 

k : ell:: v; v h a l e c a m :. c a v 400 
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AAG ATA GAG CTG CTG AAC TGG GTG AAT GCT TTG GAG CAA GCG ATG CTA GAT GCT TTA GTC 1200 

L D R V D F V K L L I E N G V N M Q H F 420

TTA GAT CGT GTC GAC TTT GTG AAG CTC CTG ATT GAA AAC GGA GTG AAC ATG CAA CAC TTT 1260 

L T I P R L E E L Y N T R L G P P N T L 440

CTG ACC ATT CCG AGG CTG GAG GAG CTT TAT AAC ACA AGA CTG GGT CCA CCA AAC ACA CTT 1320 

HLLVRDVKKSNLPPDYHIS L 460

CAT CTG CTG GTG AGG GAT GTG AAA AAG AGC AAC CTT CCG CCT GAT TAC CAC ATC AGC CTC 1380 

I D I G L V L E Y L M G G A Y R C N Y T 480

ATA GAC ATC GGG CTC GTG CTG GAG TAC CTC ATG GGA GGA GCC TAC CGC TGC AAC TAC ACT 1440 

RK NFRTLYNNLFGPKRPKA L 500

CGG AAA AAC TTT CGG ACC CTT TAC AAC AAC TTG TTT GGA CCA AAG AGG CCT AAA GCT CTT 1500 

K L L G M E D D E P P A K G K K K K K K 520

AAA CTT CTG GGA ATG GAA GAT GAT GAG CCT CCA GCT AAA GGG AAG AAA AAA AAA AAA AAG 1560 

K K E E E I D I D V D D P A V S R F Q Y 540

AAA AAG GAG GAA GAG ATC GAC ATT GAT GTG GAC GAC CCT GCC GTG ACT CGG TTC GAG TAT 1620 

P F H E L M V W A V L M K R Q K M A V F 560

CCC TTC CAC GAG CTG ATG GTG TGG GGA GTG CTG ATG AAA CGC CAG AAA ATG GGA GTG TTC 1680 

LWQRG EE SMAKALVACKLY K 580

CTC TGG CAG CGA GGG GAA GAG AGC ATG GCC AAG GCC CTG GTG GCC TGC AAG CTC TAC AAG 1740 

AMAHESSESD LVDDISQDL D 600

GCC ATG GCC CAC GAG TCC TCC GAG AGT GAT CTG GTG GAT GAC ATC TCC CAG GAC TTG GAT 1800 

NNSKDFGQLALELLDQSYKH 620

AAC AAT TCC AAA GAC TTC GGC CAG CTT GCT TTG GAG TTA TTA GAC CAG TCC TAT AAG CAT 1860 

D E Q I A M K L L T Y E L K N W S N S T 640

GAC GAG CAG ATC GCT ATG AAA CTC CTG ACC TAC GAG CTG AAA AAC TGG AGC AAC TCG ACC 1920 

CLKLAVAAKH RDF I A H T C S Q 660

TGC CTC AAA CTG GCC GTG GGA GCC AAA CAC CGG GAC TTC ATT GCT CAC ACC TGC AGC CAG 1980 

M L L T D M W M G R L R M R K N P G L K 680

ATG CTG CTG ACC GAT ATG TGG ATG GGA AGA CTG CGG ATG CGG AAG AAC CCC GGC CTG AAG 2040 

VIMGI LLPPTILFLEFRTY D 700 

GTT ATC ATG GGG ATT CTT CTA CCC CCC ACC ATC TTG TTT TTG GAA TTT CGC ACA TAT GAT 2100 

DF SYQTS KE N EDGKEK EEE N 720

GAT TTC TCG TAT CAA ACA TCC AAG GAA AAC GAG GAT GGC AAA GAA AAA GAA GAG GAA AAT 2160 

TDANADAGSRKGDEENEHK K 740 

ACG GAT GCA AAT GGA GAT GCT GGC TCA AGA AAG GGG GAT GAG GAG AAC GAG CAT AAA AAA 2220 

QR I I P IGTKI CKFY NAPIV K 760

CAA AGA ATT ATC CCC ATC GGA ACA AAA ATC TGT AAA TTC TAT AAC GCG CCC ATT GTC AAG 2280 

F W F Y T ISYLG YLLLFNYVI L 780

TTC TGG TTT TAC ACA ATA TCA TAC TTG GGC TAC CTG CTG CTG TTT AAC TAC GTC ATC CTG 2340 

VRMDGWPSLQEWIVISYIV S 800 

GTO CGG ATG GAT GGC TGG CCG TCC CTC CAG GAG TGG ATC GTC ATC TCC TAC ATC GTG AGC 2400 

LALEK IREILMSEPGKLSQ K 820

CTG GCG TTA GAG AAG ATA CGA GAG ATC CTC ATG TCA GAA CCA GGC AAA CTC AGC CAG AAA 2460 

ma ^_^ FIG. 6(PAQE20F4) 
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X KVWLQSYWMITD1VAISTF 840 

ATC AAA CTT TGG CTT CAG GAG TAC TGG AAC ATC ACA GAT CTC GTG GCC ATT TCC ACA TTC 2520 

MIGAILRLQNQPYMGYGRVI 860

ATG ATT GGA GCA ATT CTT CGC CTA CAG AAC CAG CCC TAC ATG GGC TAT GGC CGG GTG ATC 2580 

YCVDI I F W Y I RVLDIFGVNK 880 

TAC TGT GTG GAT ATC ATC TTC TGG TAC ATC CGT GTC CTG GAC ATC TTT GGT GTC AAC AAG 2640 

YLGPYVMMIGKMMIDMLYFV 900

TAT CTG GGG CCA TAC GTG ATG ATG ATT GGA AAG ATG ATG ATC GAC ATG CTG TAC TTT GTG 2700 

VIMLVVLMSFGVARQAILHP 920 

GTC ATC ATG CTG GTC GTG CTC ATG AGT TTC GGA GTA GCC CGT CAA GCC ATT CTG CAT CCA 2760 

EEKPSW KLARN IFYMPYWMI 940 

GAG GAG AAG CCC TCT TGG AAA CTG GCC CGA AAC ATC TTC TAC ATG CCC TAC TGG ATG ATC 2820 

YGEVFADQIDLYAMEINPPC 960

TAT GGA GAG GTG TTT GCA GAC CAG ATA GAC CTC TAC GCC ATG GAA ATT AAT CCT CCT TGT 2880 

GENLYDEEGKRLPPCIPGAW 980 

GGT GAG AAC CTA TAT GAT GAG GAG GGC AAG CGG CTT CCT CCC TGT ATC CCC GGC GCC TGG 2940 

LTPALMACYLLVA NILLVNL 1000 

CTC ACT CCA GCA CTC ATG GCG TGC TAT CTA CTG GTC GCC AAC ATC CTG CTG GTG AAC CTG 3000 

LIAVFNNTFFEVKSISNQVW 1020 

CTG ATT GCT GTG TTC AAC AAT ACT TTC TTT GAA GTA AAA TCA ATA TCC AAC CAG GTG TGG 3060 

KFQRYQLI MTFHDRPVL P P P 1040 

AAG TTC CAG CGA TAT CAG CTG ATT ATG ACA TTT CAT GAC AGG CCA GTC CTG CCC CCA CCC 3120 

MI ILSHIYII IMRLSGRC RK 1060 

ATG ATC ATT TTA AGC CAC ATC TAC ATC ATC ATT ATG CGT CTC AGC GGC CGC TGC AGG AAA 3180 

KRE GDQEERDRGLKLFLS DE 1080

AAG AGA GAA GGG GAC CAA GAG GAA CGG GAT CGT GGA TTG AAG CTC TTC CTT AGC GAC GAG 3240 

ELKRLHEFEEQCVQEHFR EK 1100

GAG CTA AAG AGG CTG CAT GAG TTC GAG GAG CAG TGC GTG CAG GAG CAC TTC CGG GAG AAG 3300 

EDEQQSSSDERIRVTSERVE 1120 

GAG GAT GAG CAG CAG TCG TCC AGC GAC GAG CGC ATC CGG GTC ACT TCT GAA AGA GTT GAA 3360 

NMSMRLEEINERETFMKT S L 1140 

AAT ATG TCA ATG AGG TTG GAA GAA ATC AAT GAA AGA GAA ACT TTT ATG AAA ACT TCC CTG 3420 

QTVDLRLAQLEELSNRMV N A 1160 

CAG ACT GTT GAC CTT CGA CTT GCT CAG CTA GAA GAA TTA TCT AAC AGA ATG GTG AAT GCT 3480 

L ENLAG I DR S DL I Q A R SR AS 1180 

CTT GAA AAT CTT GCG GGA ATC GAC AGG TCT GAC CTG ATC CAG GCA CGG TCC CGG OCT TCT 3540 

SECEATYLLRQSSINSADGY 1200 

TCT GAA TGT GAG GCA ACG TAT CTT CTC CGG CAA AGC AGC ATC AAT AGC GCT GAT GGC TAC 3600 

S LYRYHFNGEELLFEDTS L S 1220 

AGC TTG TAT CGA TAT CAT TTT AAC GGA GAA GAG TTA TTA TTT GAG GAT ACA TCT CTC TCC 3660 

TSPGTGVRKKTCSFRIKE EK 1240 

ACG TCA CCA GGG ACA GGA GTC AGG AAA AAA ACC TGT TCC TTC CGT ATA AAG GAA GAG AAG 3720 
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DVKT «^VPECQNSLHLSLGT 
GAC GTG AAA ACG CAC CTA GTC CCA GAA TGT CAG AAC AGT CTT CAC CTT TCA CTG GGC ACA 

STS ATPDQ SHLAVDDLKNAE 
AGC ACA TCA GCA ACC CCA GAT GGC AGT CAC CTT GCA GTA GAT GAC TTA AAG AAC GCT GAA 

r .L m ! KLGPDlG lSKEDDERQTD 
GAG TCA AAA TTA GGT CCA GAT ATT GGG ATT TCA AAG GAA GAT GAT GAA AGA CAG ACA GAC 

SKKEET *SPSLNKTDVIHGQ 
TCT AAA AAA GAA GAA ACT ATT TCC CCA AGT TTA AAT AAA ACA GAT GTG ATA CAT GGA CAG 

J*r* ^ DV QNTQLTVETT N I E GT 

GAC AAA TCA GAT GTT CAA AAC ACT CAG CTA ACA GTG GAA ACG ACA AAT ATA GAA GGC ACT 

& LJL YPI ' EETKl ' r RYE r PD'ETrN 
ATT TCC TAT CCC CTG GAA GAA ACC AAA ATT ACA CGC TAT TTC CCC GAT GAA ACG ATC AAT 

A,JL* KTMKSRSFVVS RGRKLVG 
GCT TGT AAA ACA ATG AAG TCC AGA AGC TTC GTC TAT TCC CGG GGA AGA AAG CTG GTC GGT 

GVN ^ D VEy SSZTDQQLT ^ :r 

GGG GTT AAC CAG GAT GTA GAG TAC AGT TCA ATC ACG GAC CAG CAA TTG ACG ACG GAA TOG 

°Jz QVQKlT *SHSTDIPYrvS 
CAA TGC CAA GTT CAA AAG ATC ACG CGC TCT CAT AGC ACA GAT ATT CCT TAC ATT GTG TCG 

EAAVQAEQKEQ FADMQDEHH 
GAA GCT GCA GTG CAA GCT GAG CAA AAA GAG CAG TTT GCA GAT ATG CAA GAT GAA CAC CAT 

JL AEA I * * I PHLSLTITDRNG 
GTC GCT GAA GCA ATT CCT CGA ATC CCT CGC TTG TCC CTA ACC ATT ACT GAC AGA AAT GGG 

"ENLLSVKPDQTLGFPSLRS 
ATG GAA AAC TTA CTG TCT GTG AAG CCA GAT CAA ACT TTG GGA TTC CCA TCT CTC AGG TCA 

KSLHGHPRNVKSrQGKLDRS 
AAA AGT TTA CAT GGA CAT CCT AGG AAT GTG AAA TCC ATT CAG GGA AAG TTA GAC AGA TCT 

GHA SSVSSLVIVSGMTAEEK 
GGA CAT-GCC AGT AGT GTA AGC AGC TTA GTA ATT GTG TCT GGA ATG ACA GCA GAA GAA AAA 

kvkkekastetec* (SEQ ID NO: 9) 

AAG GTT AAG AAA GAG AAA GCT TCC ACA GAA ACT GAA TGC TAG TCTGTTTTCTTICTTTAATTTTT 
TITTITAACAGTC^^ 

tccct (SEQ 10 NO: 8) 
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